== History ==
The Dirac equation in the form originally proposed by [[Paul Dirac|Dirac]] is:<ref>{{Cite book |last=Pais |first=Abraham |title=Inward bound: of matter and forces in the physical world |date=2002 |publisher=Clarendon Press [u.a.] |isbn=978-0-19-851997-3 |edition=Reprint |location=Oxford}}</ref>{{rp|291}}<ref>{{cite book |last=Dirac |first=Paul A.M. |title=Principles of Quantum Mechanics |edition=4th |page=255 |publisher=Oxford University Press |series=International Series of Monographs on Physics |orig-year=1958 |year=1982 |isbn=978-0-19-852011-5}}</ref>
<math display="block">\left(\beta mc^2 + c \sum_{n = 1}^{3}\alpha_n p_n\right) \psi (x,t) = i \hbar \frac{\partial\psi(x,t) }{\partial t} </math>
where {{math|''ψ''(''x'', ''t'')}} is the [[wave function]] for an [[electron]] of [[rest mass]] {{math|''m''}} with [[spacetime]] coordinates {{math|''x'', ''t''}}. {{math|''p''<sub>1</sub>, ''p''<sub>2</sub>, ''p''<sub>3</sub>}} are the components of the [[momentum]], understood to be the [[momentum operator]] in the [[Schrödinger equation]]. {{math|''c''}} is the [[speed of light]], and {{math|''ħ''}} is the [[reduced Planck constant]]; these fundamental [[physical constant]]s reflect special relativity and quantum mechanics, respectively. {{math|''α''<sub>''n''</sub>}} and {{math|''β''}} are {{nowrap|4 × 4}} [[gamma matrices]].

Dirac's purpose in casting this equation was to explain the behavior of the relativistically moving electron, thus allowing the atom to be treated in a manner consistent with relativity. He hoped that the corrections introduced this way might have a bearing on the problem of [[atomic spectra]].

Up until that time, attempts to make the old quantum theory of the atom compatible with the theory of relativity had failed. These attempts were based on discretizing the [[angular momentum]] stored in the electron's possibly non-circular orbit of the [[atomic nucleus]]. The new quantum mechanics of [[Werner Heisenberg|Heisenberg]], [[Wolfgang Pauli|Pauli]], [[Pascual Jordan|Jordan]], [[Erwin Schrödinger|Schrödinger]], and Dirac himself had not yet developed sufficiently to treat this problem. Although Dirac's original intentions were satisfied, his equation had far deeper implications for the structure of matter and introduced new mathematical classes of objects that are now essential elements of fundamental physics.

The new elements in this equation are the four {{nowrap|4 × 4}} [[matrix (mathematics)|matrices]] {{math|''α''<sub>1</sub>}}, {{math|''α''<sub>2</sub>}}, {{math|''α''<sub>3</sub>}} and {{math|''β''}}, and the four-component [[wave function]] {{math|''ψ''}}. There are four components in {{math|''ψ''}} because its value at any given point in configuration space is a [[bispinor]]. It is interpreted as a superposition of a [[Spin-1/2|spin-up]] electron, a spin-down electron, a spin-up positron, and a spin-down positron.

The {{nowrap|4 × 4}} matrices {{math|''α''<sub>''k''</sub>}} and {{math|''β''}} are all [[Hermitian matrix|Hermitian]] and are [[involutory matrix|involutory]]:
<math display="block">\alpha_i^2 = \beta^2 = I_4</math>
and they all mutually [[anticommute|anti-commute]]:
<math display="block">\begin{align}
  \alpha_i\alpha_j + \alpha_j\alpha_i &= 0\quad(i \neq j) \\
        \alpha_i\beta + \beta\alpha_i &= 0
\end{align}</math>
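
These algebraic properties can be checked numerically. The following is a minimal sketch, assuming the standard (Dirac) representation in which each {{math|''α''<sub>''k''</sub>}} has the corresponding Pauli matrix in its off-diagonal blocks and {{math|''β''}} is block-diagonal; the relations above do not fix a representation, and the variable and function names are illustrative only.

```python
import numpy as np

# Pauli matrices and 2x2 helper blocks.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)

# Assumption: standard (Dirac) representation of alpha_k and beta.
alphas = [np.block([[Z2, s], [s, Z2]]) for s in (sx, sy, sz)]
beta = np.block([[I2, Z2], [Z2, -I2]])
matrices = alphas + [beta]


def anticommutator(a, b):
    """Return ab + ba."""
    return a @ b + b @ a


# Each matrix equals its own conjugate transpose.
hermitian = all(np.allclose(m, m.conj().T) for m in matrices)
# Each matrix squares to the 4x4 identity.
involutory = all(np.allclose(m @ m, I4) for m in matrices)
# Every distinct pair anticommutes.
anticommuting = all(
    np.allclose(anticommutator(matrices[i], matrices[j]), 0)
    for i in range(4)
    for j in range(i + 1, 4)
)

print(hermitian, involutory, anticommuting)
```

All three flags should come out true for this representation; any other set of matrices related to it by a unitary change of basis would pass the same checks.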

These matrices and the form of the wave function have a deep mathematical significance. The algebraic structure represented by the [[gamma matrices]] had been created some 50&nbsp;years earlier by the English mathematician [[William Kingdon Clifford|W. K. Clifford]]. In turn, Clifford's ideas had emerged from the mid-19th-century work of German mathematician [[Hermann Grassmann]] in his ''Lineare Ausdehnungslehre'' (''Theory of Linear Extension'').{{citation needed |reason=Historical claims need refs |date=July 2024}}

{{clear}}

=== Making the Schrödinger equation relativistic ===
The Dirac equation is superficially similar to the Schrödinger equation for a massive [[free particle]]:
<math display="block">-\frac{\hbar^2}{2m}\nabla^2\phi = i\hbar\frac{\partial}{\partial t}\phi ~.</math>

The left side represents the square of the momentum operator divided by twice the mass, which is the non-relativistic kinetic energy. Because relativity treats space and time as a whole, a relativistic generalization of this equation requires that space and time derivatives enter symmetrically, as they do in the [[Maxwell equation]]s that govern the behavior of light: the equations must be of the ''same order'' in the space and time derivatives. In relativity, the momentum and the energy are the space and time parts of a spacetime vector, the [[four-momentum]], and they are related by the relativistically invariant relation
<math display="block">E^2 = m^2c^4 + p^2c^2 ,</math>
which says that the [[Four-momentum#Minkowski norm|length of this four-vector]] is proportional to the rest mass {{math|''m''}}. Substituting the operator equivalents of the energy and momentum from the Schrödinger theory produces the [[Klein–Gordon equation]] describing the propagation of waves, constructed from relativistically invariant objects,
<math display="block">\left(-\frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \nabla^2\right)\phi = \frac{m^2c^2}{\hbar^2}\phi ,</math>
with the wave function <math>\phi</math> being a relativistic scalar: a complex number that has the same numerical value in all frames of reference. Space and time derivatives both enter to second order. This has a telling consequence for the interpretation of the equation. Because the equation is second order in the time derivative, one must specify initial values both of the wave function itself and of its first time-derivative in order to solve definite problems. Since both may be specified more or less arbitrarily, the wave function cannot maintain its former role of determining the [[probability density function|probability density]] of finding the electron in a given state of motion. In the Schrödinger theory, the probability density is given by the positive definite expression
<math display="block">\rho = \phi^*\phi </math>
and this density is convected according to the probability current vector
<math display="block">J = -\frac{i\hbar}{2m}(\phi^*\nabla\phi - \phi\nabla\phi^*) </math>
with the conservation of probability current and density following from the continuity equation:
<math display="block">\nabla\cdot J + \frac{\partial\rho}{\partial t} = 0~.</math>
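
As a short check, the continuity equation follows directly from the free Schrödinger equation above. Writing that equation and its complex conjugate as <math>\partial_t\phi = \tfrac{i\hbar}{2m}\nabla^2\phi</math> and <math>\partial_t\phi^* = -\tfrac{i\hbar}{2m}\nabla^2\phi^*</math>, one finds
<math display="block">\frac{\partial\rho}{\partial t} = \phi^*\partial_t\phi + \phi\,\partial_t\phi^* = \frac{i\hbar}{2m}\left(\phi^*\nabla^2\phi - \phi\nabla^2\phi^*\right) = \frac{i\hbar}{2m}\nabla\cdot\left(\phi^*\nabla\phi - \phi\nabla\phi^*\right) = -\nabla\cdot J ,</math>
where the third equality holds because the cross terms <math>\nabla\phi^*\cdot\nabla\phi</math> cancel.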

The fact that the density is [[Positive-definite function|positive definite]] and convected according to this continuity equation implies that one may integrate the density over a certain domain and set the total to 1, and this condition will be maintained by the [[conservation law]]. A proper relativistic theory with a probability density current must also share this feature. To maintain the notion of a convected density, one must generalize the Schrödinger expression of the density and current so that space and time derivatives again enter symmetrically in relation to the scalar wave function. The Schrödinger expression can be kept for the current, but the probability density must be replaced by the symmetrically formed expression{{explain|reason=Why?|date=November 2021}}
<math display="block">\rho = \frac{i\hbar}{2mc^2} \left(\psi^*\partial_t\psi - \psi\partial_t\psi^* \right) ,</math>
which now becomes the 4th component of a spacetime vector, and the entire [[Probability current|probability 4-current density]] has the relativistically covariant expression
<math display="block">J^\mu = \frac{i\hbar}{2m} \left(\psi^*\partial^\mu\psi - \psi\partial^\mu\psi^* \right) .</math>

The continuity equation is as before. Everything is compatible with relativity now, but the expression for the density is no longer positive definite; the initial values of both {{math|''ψ''}} and {{math|∂<sub>''t''</sub>''ψ''}} may be freely chosen, and the density may thus become negative, something that is impossible for a legitimate probability density. Thus, one cannot obtain a simple generalization of the Schrödinger equation under the naive assumptions that the wave function is a relativistic scalar and that the equation it satisfies is second order in time.
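
A plane wave makes the sign problem explicit. The following is a sketch in which the normalization constant {{math|''N''}} and the sign convention of the phase are choices made for illustration. For
<math display="block">\psi = N e^{-i(Et - \vec{p}\cdot\vec{x})/\hbar}, \qquad E = \pm c\sqrt{p^2 + m^2c^2} ,</math>
one has <math>\partial_t\psi = -\tfrac{iE}{\hbar}\psi</math>, so the density defined above becomes
<math display="block">\rho = \frac{i\hbar}{2mc^2}\left(-\frac{2iE}{\hbar}\right)|N|^2 = \frac{E}{mc^2}|N|^2 .</math>
Both signs of {{math|''E''}} solve the second-order equation, and the negative root gives a negative density.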

Although it is not a successful relativistic generalization of the Schrödinger equation, this equation is resurrected in the context of [[quantum field theory]], where it is known as the [[Klein–Gordon equation]], and describes a spinless particle field (e.g. [[pi meson]] or [[Higgs boson]]). Historically, Schrödinger himself arrived at this equation before the one that bears his name but soon discarded it. In the context of quantum field theory, the indefinite density is understood to correspond to the ''charge'' density, which can be positive or negative, and not the probability density.

=== Dirac's coup ===
Dirac thus thought to try an equation that was ''first order'' in both space and time. He postulated an equation of the form
<math display="block">E\psi = (\vec{\alpha} \cdot \vec{p} + \beta m) \psi</math>
where the operators <math>(\vec{\alpha}, \beta)</math> must be independent of <math>(\vec{p}, t)</math> for linearity and independent of <math>(\vec{x}, t)</math> for space-time homogeneity. These constraints implied additional dynamical variables that the <math>(\vec{\alpha}, \beta)</math> operators will depend upon; from this requirement Dirac concluded that the operators would depend upon {{nowrap|4 × 4}} matrices, related to the Pauli matrices.<ref>{{Cite book |last1=Duck |first1=Ian |url=https://www.worldscientific.com/worldscibooks/10.1142/3457 |title=Pauli and the Spin-Statistics Theorem |last2=Sudarshan |first2=E C G |date=1998 |publisher=WORLD SCIENTIFIC |isbn=978-981-02-3114-9 |language=en |doi=10.1142/3457}}</ref>{{rp|205}}

One could, for example, formally (i.e. by [[abuse of notation]], since it is not straightforward to take a [[functional square root]] of the sum of two differential operators) take the [[Energy–momentum relation|relativistic expression for the energy]]
<math display="block">E = c \sqrt{p^2 + m^2c^2} ~,</math>
replace {{math|''p''}} by its operator equivalent, expand the square root in an [[infinite series]] of derivative operators, set up an eigenvalue problem, then solve the equation formally by iterations. Most physicists had little faith in such a process, even if it were technically possible.
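
Written out as a formal expansion in powers of <math>p^2/(m^2c^2)</math> (a sketch only, with no claim about convergence), the series begins
<math display="block">E = mc^2\sqrt{1 + \frac{p^2}{m^2c^2}} = mc^2 + \frac{p^2}{2m} - \frac{p^4}{8m^3c^2} + \cdots ,</math>
and the substitution <math>\vec{p} \to -i\hbar\nabla</math> turns it into
<math display="block">E \to mc^2 - \frac{\hbar^2}{2m}\nabla^2 - \frac{\hbar^4}{8m^3c^2}\nabla^4 + \cdots ,</math>
an operator containing spatial derivatives of arbitrarily high order, while the time derivative enters only to first order.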

As the story goes, Dirac was staring into the fireplace at Cambridge, pondering this problem, when he hit upon the idea of taking the square root of the wave operator (see also [[half derivative]]) thus:
<math display="block">\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \left(A \partial_x + B \partial_y + C \partial_z + \frac{i}{c}D \partial_t\right)\left(A \partial_x + B \partial_y + C \partial_z + \frac{i}{c}D \partial_t\right)~.</math>

On multiplying out the right side it is apparent that, in order to get all the cross-terms such as {{math|∂<sub>''x''</sub>∂<sub>''y''</sub>}} to vanish, one must assume
<math display="block">AB + BA = 0, ~ \ldots ~</math>
with
<math display="block">A^2 = B^2 = \dots = 1~.</math>
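
Written out, assuming that {{math|''A''}}, {{math|''B''}}, {{math|''C''}} and {{math|''D''}} are constant and that the partial derivatives commute with one another, the square on the right side is
<math display="block">\begin{align}
  \left(A \partial_x + B \partial_y + C \partial_z + \frac{i}{c}D \partial_t\right)^2
    ={}& A^2\partial_x^2 + B^2\partial_y^2 + C^2\partial_z^2 - \frac{1}{c^2}D^2\partial_t^2 \\
       &+ (AB + BA)\,\partial_x\partial_y + (BC + CB)\,\partial_y\partial_z + (CA + AC)\,\partial_z\partial_x \\
       &+ \frac{i}{c}\left[(AD + DA)\,\partial_x + (BD + DB)\,\partial_y + (CD + DC)\,\partial_z\right]\partial_t ~.
\end{align}</math>
Matching this with the wave operator requires each squared coefficient to equal 1 and each of the six anticommutators to vanish.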

Dirac, who had just then been intensely involved with working out the foundations of Heisenberg's [[matrix mechanics]], immediately understood that these conditions could be met if {{math|''A''}}, {{math|''B''}}, {{math|''C''}} and {{math|''D''}} are ''matrices'', with the implication that the wave function has ''multiple components''. This immediately explained the appearance of two-component wave functions in Pauli's phenomenological theory of [[Spin (physics)|spin]], something that up until then had been regarded as mysterious, even to Pauli himself. However, one needs at least {{nowrap|4 × 4}} matrices to set up a system with the properties required – so the wave function had ''four'' components, not two, as in the Pauli theory, or one, as in the bare Schrödinger theory. The four-component wave function represents a new class of mathematical object in physical theories that makes its first appearance here.

Given the factorization in terms of these matrices, one can now write down immediately an equation
<math display="block">\left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t\right)\psi = \kappa\psi </math>
with <math>\kappa</math> to be determined. Applying the matrix operator again to both sides yields
<math display="block">\left(\nabla^2 - \frac{1}{c^2}\partial_t^2\right)\psi = \kappa^2\psi ~.</math>

Taking <math>\kappa = \tfrac{mc}{\hbar}</math> shows that all the components of the wave function ''individually'' satisfy the relativistic energy–momentum relation. Thus the sought-for equation that is first-order in both space and time is
<math display="block">\left(A\partial_x + B\partial_y + C\partial_z + \frac{i}{c}D\partial_t - \frac{mc}{\hbar}\right)\psi = 0 ~.</math>

Setting
<math display="block">A = i \beta \alpha_1 \, , \, B = i \beta \alpha_2 \, , \, C = i \beta \alpha_3 \, , \, D = \beta ~, </math>
and because <math>D^2 = \beta^2 = I_4 </math>, the Dirac equation is produced as written above.
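
The intermediate steps can be sketched as follows, using a summation over {{math|''k'' {{=}} 1, 2, 3}} and the identification <math>p_k = -i\hbar\partial_k</math> from the Schrödinger theory. Substituting the four matrices into the first-order equation gives
<math display="block">\left(i\beta\alpha_k\partial_k + \frac{i}{c}\beta\,\partial_t - \frac{mc}{\hbar}\right)\psi = 0 ~.</math>
Multiplying from the left by <math>\hbar c \beta</math> and using <math>\beta^2 = I_4</math> yields
<math display="block">\left(i\hbar c\,\alpha_k\partial_k + i\hbar\,\partial_t - \beta mc^2\right)\psi = 0 ~,</math>
and replacing <math>i\hbar\partial_k</math> by <math>-p_k</math> gives
<math display="block">i\hbar\frac{\partial\psi}{\partial t} = \left(\beta mc^2 + c\,\alpha_k p_k\right)\psi ~.</math>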

=== Covariant form and relativistic invariance ===
To demonstrate the [[Lorentz covariance|relativistic invariance]] of the equation, it is advantageous to cast it into a form in which the space and time derivatives appear on an equal footing. New matrices are introduced as follows:
<math display="block">\begin{align}
  D &=   \gamma^0, \\
  A &= i \gamma^1,\quad B = i \gamma^2,\quad C = i \gamma^3,
\end{align}</math>
and the equation takes the form (remembering the definition of the covariant components of the [[4-gradient]] and especially that {{math|1=∂<sub>0</sub> = {{sfrac|1|''c''}}∂<sub>''t''</sub>}})

{{Equation box 1
|title='''Dirac equation'''
|indent=:
|equation = <math>(i \hbar \gamma^\mu \partial_\mu - m c) \psi = 0</math>
|border
|border colour =#50C878
|background colour = #ECFCF4
}}

where there is an [[Einstein notation|implied summation]] over the values of the twice-repeated index {{math|''μ'' {{=}} 0, 1, 2, 3}}, and {{math|∂<sub>''μ''</sub>}} is the [[4-gradient]]. In practice one often writes the [[gamma matrices]] in terms of 2 × 2 sub-matrices taken from the [[Pauli matrices]] and the 2 × 2 [[identity matrix]]. Explicitly the [[gamma matrices#Dirac basis|standard representation]] is
<math display="block">
\gamma^0 = \begin{pmatrix} I_2 &        0 \\         0 & -I_2 \end{pmatrix},\quad
\gamma^1 = \begin{pmatrix}   0 & \sigma_x \\ -\sigma_x &    0 \end{pmatrix},\quad
\gamma^2 = \begin{pmatrix}   0 & \sigma_y \\ -\sigma_y &    0 \end{pmatrix},\quad
\gamma^3 = \begin{pmatrix}   0 & \sigma_z \\ -\sigma_z &    0 \end{pmatrix}.
</math>

The complete system is summarized using the [[Minkowski metric]] on spacetime in the form
<math display="block">\left\{\gamma^\mu, \gamma^\nu\right\} = 2 \eta^{\mu\nu} I_4</math>
where the bracket expression
<math display="block">\{a, b\} = ab + ba</math>
denotes the [[anticommutator]]. These are the defining relations of a [[Clifford algebra]] over a pseudo-orthogonal 4-dimensional space with [[metric signature]] {{math|(+ − − −)}}. The specific Clifford algebra employed in the Dirac equation is known today as the [[Dirac algebra]]. Although not recognized as such by Dirac at the time the equation was formulated, in hindsight the introduction of this ''[[geometric algebra]]'' represents an enormous stride forward in the development of quantum theory.
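
The anticommutation relations can be verified numerically for the standard representation given above. This is a minimal sketch; the helper name <code>clifford_holds</code> and the variable names are illustrative and not part of any library.

```python
import numpy as np

# Pauli matrices and 2x2 helper blocks.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)

# Standard representation: gamma^0 is block-diagonal, gamma^k off-diagonal.
gammas = [np.block([[I2, Z2], [Z2, -I2]])]
gammas += [np.block([[Z2, s], [-s, Z2]]) for s in (sx, sy, sz)]

# Minkowski metric with signature (+ - - -).
eta = np.diag([1.0, -1.0, -1.0, -1.0])


def clifford_holds(gammas, metric):
    """Check {gamma^mu, gamma^nu} = 2 * metric[mu, nu] * I_4 for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            anti = gammas[mu] @ gammas[nu] + gammas[nu] @ gammas[mu]
            if not np.allclose(anti, 2 * metric[mu, nu] * I4):
                return False
    return True


print(clifford_holds(gammas, eta))
```

Passing a Euclidean metric such as <code>np.eye(4)</code> instead makes the check fail, because the spatial gammas square to minus the identity.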

The Dirac equation may now be interpreted as an [[eigenvalue]] equation, where the rest mass is proportional to an eigenvalue of the [[4-momentum operator]], the [[proportionality constant]] being the speed of light:
<math display="block"> \operatorname{P}_\mathsf{op} \psi = m c \psi .</math>

Using <math>{\partial\!\!\!/} \mathrel{\stackrel{\mathrm{def}}{=}} \gamma^\mu \partial_\mu</math> (<math>{\partial\!\!\!\big /}</math> is pronounced "d-slash"),<ref>{{cite book |last=Pendleton |first=Brian |url=http://www2.ph.ed.ac.uk/~bjp/qt/rqt.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://www2.ph.ed.ac.uk/~bjp/qt/rqt.pdf |archive-date=2022-10-09 |url-status=live |title=Quantum Theory |year=2012–2013 |at=section&nbsp;4.3 "The Dirac Equation"}}</ref> according to Feynman slash notation, the Dirac equation becomes:
<math display="block">i \hbar {\partial\!\!\!\big /} \psi - m c \psi = 0 .</math>

In practice, physicists often use units of measure such that {{math|''ħ'' {{=}} ''c'' {{=}} 1}}, known as [[natural units]]. The equation then takes the simple form

{{Equation box 1
|title='''Dirac equation''' '' (natural units)''
|indent=:
|equation = <math>(i{\partial\!\!\!\big /} - m) \psi = 0</math>
|border
|border colour = #50C878
|background colour = #ECFCF4
}}
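
As an illustration in natural units, a plane-wave ansatz reduces the equation to the matrix eigenvalue problem {{math|''Eu'' {{=}} ('''α''' · '''p''' + ''βm'')''u''}}, the form postulated earlier. The sketch below assumes the standard representation and an arbitrarily chosen momentum and mass; the function name <code>dirac_hamiltonian</code> is hypothetical. It checks that the four eigenvalues are the two roots of the relativistic energy–momentum relation, each occurring twice.

```python
import numpy as np

# Pauli matrices and the 2x2 identity.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def dirac_hamiltonian(p, m):
    """Free Hamiltonian alpha.p + beta*m in natural units (standard representation)."""
    px, py, pz = p
    sigma_dot_p = px * sx + py * sy + pz * sz
    return np.block([[m * I2, sigma_dot_p], [sigma_dot_p, -m * I2]])


# Illustrative values only.
p = (0.3, -1.2, 0.5)
m = 1.0

# eigvalsh returns the eigenvalues of a Hermitian matrix in ascending order.
energies = np.linalg.eigvalsh(dirac_hamiltonian(p, m))
expected = np.sqrt(np.dot(p, p) + m**2)

print(energies)
print(expected)
```

The two negative eigenvalues are the same negative-energy roots that appeared for the second-order equation; here they arise as eigenvalues of a Hermitian matrix.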

A foundational theorem, known as Pauli's fundamental theorem, states that if two distinct sets of {{nowrap|4 × 4}} matrices are given that both satisfy the [[Clifford algebra|Clifford relations]], then they are connected to each other by a [[Matrix similarity|similarity transform]]:
<math display="block">\gamma^{\mu\prime} = S^{-1} \gamma^\mu S ~.</math>

If in addition the matrices are all [[unitary transformation|unitary]], as is the Dirac set, then {{math|''S''}} can itself be chosen to be [[unitary matrix|unitary]]:
<math display="block">\gamma^{\mu\prime} = U^\dagger \gamma^\mu U ~.</math>

The transformation {{math|''U''}} is unique up to a multiplicative factor of absolute value 1. Let us now imagine a [[Lorentz transformation]] to have been performed on the space and time coordinates, and on the derivative operators, which form a covariant vector. For the operator {{math|''γ''<sup>''μ''</sup>∂<sub>''μ''</sub>}} to remain invariant, the gammas must transform among themselves as a contravariant vector with respect to their spacetime index. These new gammas will themselves satisfy the Clifford relations, because of the orthogonality of the Lorentz transformation. By the previously mentioned foundational theorem,{{which|date=September 2024}} one may replace the new set by the old set subject to a unitary transformation. In the new frame, remembering that the rest mass is a relativistic scalar, the Dirac equation will then take the form
<math display="block">\begin{align}
  \left(iU^\dagger \gamma^\mu U\partial_\mu^\prime - m\right)\psi\left(x^\prime, t^\prime\right) &= 0 \\
              U^\dagger(i\gamma^\mu\partial_\mu^\prime - m)U \psi\left(x^\prime, t^\prime\right) &= 0 ~.
\end{align}</math>

If the transformed spinor is defined as
<math display="block">\psi^\prime = U\psi</math>
then the transformed Dirac equation is produced in a way that demonstrates [[Manifest covariance|manifest relativistic invariance]]:
<math display="block">\left(i\gamma^\mu\partial_\mu^\prime - m\right)\psi^\prime\left(x^\prime, t^\prime\right) = 0 ~.</math>

Thus, settling on any unitary representation of the gammas is final, provided the spinor is transformed according to the unitary transformation that corresponds to the given Lorentz transformation.

The various representations of the Dirac matrices bring into focus particular aspects of the physical content of the Dirac wave function. The representation shown here is known as the ''standard'' representation – in it, the wave function's upper two components go over into Pauli's two-component spinor wave function in the limit of low energies and of velocities small compared with the speed of light.
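
A change of representation can be illustrated numerically. The sketch below assumes one particular unitary matrix {{math|''U''}}, chosen here for illustration and not taken from the text; it maps the standard representation to a form in which {{math|''γ''<sup>0</sup>}} is off-diagonal, commonly called a chiral or Weyl representation. The code checks that {{math|''U''}} is unitary and that the transformed matrices still satisfy the Clifford relations.

```python
import numpy as np

# Pauli matrices and 2x2 helper blocks.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
I4 = np.eye(4, dtype=complex)

# Standard (Dirac) representation.
dirac = [np.block([[I2, Z2], [Z2, -I2]])]
dirac += [np.block([[Z2, s], [-s, Z2]]) for s in (sx, sy, sz)]

# Assumption: an illustrative unitary change of basis.
U = np.block([[I2, I2], [-I2, I2]]) / np.sqrt(2)

# gamma' = U^dagger gamma U, following the convention used in the text.
chiral = [U.conj().T @ g @ U for g in dirac]

eta = np.diag([1.0, -1.0, -1.0, -1.0])
clifford_ok = all(
    np.allclose(chiral[mu] @ chiral[nu] + chiral[nu] @ chiral[mu],
                2 * eta[mu, nu] * I4)
    for mu in range(4)
    for nu in range(4)
)

print(np.allclose(U.conj().T @ U, I4), clifford_ok)
print(chiral[0].real)
```

With this choice of {{math|''U''}} the spatial gammas are left unchanged, and only {{math|''γ''<sup>0</sup>}} moves from block-diagonal to block off-diagonal form.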

The considerations above reveal the origin of the gammas in ''geometry'', hearkening back to Grassmann's original motivation; they represent a fixed basis of unit vectors in spacetime. Similarly, products of the gammas such as {{math|''γ''<sub>''μ''</sub>''γ''<sub>''ν''</sub>}} represent ''[[oriented surface]] elements'', and so on. With this in mind, one can find the form of the unit volume element on spacetime in terms of the gammas as follows. By definition, it is
<math display="block">V = \frac{1}{4!}\epsilon_{\mu\nu\alpha\beta}\gamma^\mu\gamma^\nu\gamma^\alpha\gamma^\beta .</math>

For this to be an invariant, the [[Levi-Civita symbol|epsilon symbol]] must be a [[tensor]], and so must contain a factor of {{math|{{sqrt|''g''}}}}, where {{math|''g''}} is the [[determinant]] of the [[metric tensor]]. Since this is negative, that factor is ''imaginary''. Thus
<math display="block">V = i \gamma^0\gamma^1\gamma^2\gamma^3 .</math>

This matrix is given the special symbol {{math|''γ''<sup>5</sup>}}, owing to its importance when one is considering improper transformations of space-time, that is, those that change the orientation of the basis vectors. In the standard representation, it is
<math display="block">\gamma^5 = \begin{pmatrix} 0 & I_{2} \\ I_{2} & 0 \end{pmatrix}.</math>

This matrix will also be found to anticommute with the other four Dirac matrices:
<math display="block">\gamma^5 \gamma^\mu + \gamma^\mu \gamma^5 = 0</math>

It takes a leading role when questions of ''[[parity (physics)|parity]]'' arise because the volume element as a directed magnitude changes sign under a space-time reflection. Taking the positive square root above thus amounts to choosing a handedness convention on spacetime.
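
The stated properties of the fifth matrix can be checked numerically in the standard representation. This is a minimal sketch with illustrative variable names: it forms the product {{math|''iγ''<sup>0</sup>''γ''<sup>1</sup>''γ''<sup>2</sup>''γ''<sup>3</sup>}}, compares it with the block form given above, and tests the anticommutation with each of the four gammas.

```python
import numpy as np

# Pauli matrices and 2x2 helper blocks.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Standard representation of the four gamma matrices.
gammas = [np.block([[I2, Z2], [Z2, -I2]])]
gammas += [np.block([[Z2, s], [-s, Z2]]) for s in (sx, sy, sz)]

# The volume element: i * gamma^0 gamma^1 gamma^2 gamma^3.
gamma5 = 1j * gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]

# Block form quoted in the text for the standard representation.
expected = np.block([[Z2, I2], [I2, Z2]])

matches_block_form = np.allclose(gamma5, expected)
anticommutes = all(np.allclose(gamma5 @ g + g @ gamma5, 0) for g in gammas)

print(matches_block_form, anticommutes)
```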