Editing Discrete cosine transform (section)

== Formal definition ==
Formally, the discrete cosine transform is a [[linear]], invertible [[function (mathematics)|function]] <math> f : \R^{N} \to \R^{N} </math> (where <math> \R</math> denotes the set of [[real number]]s), or equivalently an invertible {{mvar|N}} × {{mvar|N}} [[square matrix]]. There are several variants of the DCT with slightly modified definitions. The {{mvar|N}} real numbers <math>~ x_0,\ \ldots\ x_{N - 1} ~</math> are transformed into the {{mvar|N}} real numbers <math> X_0,\, \ldots,\, X_{N - 1} </math> according to one of the formulas:

=== DCT-I ===
:<math>X_k
 = \frac{1}{2} (x_0 + (-1)^k x_{N-1}) 
 + \sum_{n=1}^{N-2} x_n \cos \left[\, \tfrac{\ \pi}{\,N-1\,} \, n \, k \,\right]
 \qquad \text{ for } ~ k = 0,\ \ldots\ N-1 ~.</math>

Some authors further multiply the <math>x_0 </math> and <math> x_{N-1} </math> terms by <math> \sqrt{2\,}\,</math> and correspondingly multiply the <math>X_0</math> and <math>X_{N-1}</math> terms by <math>1/\sqrt{2\,} \,</math> which, if one further multiplies by an overall scale factor of <math display="inline">\sqrt{\tfrac{2}{N-1\,}\,}</math>, makes the DCT-I matrix [[orthogonal matrix|orthogonal]] but breaks the direct correspondence with a real-even [[Discrete Fourier transform|DFT]].

The DCT-I is exactly equivalent (up to an overall scale factor of 2), to a DFT of <math> 2(N-1) </math> real numbers with even symmetry. For example, a DCT-I of <math>N = 5 </math> real numbers <math> a\ b\ c\ d\ e </math> is exactly equivalent to a DFT of eight real numbers {{not a typo|<math> a\ b\ c\ d\ e\ d\ c\ b </math>}} (even symmetry), divided by two. (In contrast, DCT types II-IV involve a half-sample shift in the equivalent DFT.)

Note, however, that the DCT-I is not defined for <math>N</math> less than 2, while all other DCT types are defined for any positive <math>N</math>.

Thus, the DCT-I corresponds to the boundary conditions: <math>x_n</math> is even around <math>n = 0</math> and even around <math>n = N - 1</math>; similarly for <math>X_k</math>.

=== DCT-II ===
:<math>X_k =
 \sum_{n=0}^{N-1} x_n \cos \left[\, \tfrac{\,\pi\,}{N} \left( n + \tfrac{1}{2} \right) k \, \right]
 \qquad \text{ for } ~ k = 0,\ \dots\ N-1 ~.</math>

The DCT-II is probably the most commonly used form, and is often simply referred to as the ''DCT''.<ref name="pubDCT"/><ref name="pubRaoYip"/>

This transform is exactly equivalent (up to an overall scale factor of 2) to a DFT of <math>4N</math> real inputs of even symmetry, where the even-indexed elements are zero. That is, it is half of the DFT of the <math>4N</math> inputs <math> y_n ,</math> where <math> y_{2n} = 0</math>, <math> y_{2n+1} = x_n </math> for <math>0 \leq n < N</math>, <math>y_{2N} = 0</math>, and <math>y_{4N-n} = y_n</math> for <math>0 < n < 2N</math>. DCT-II transformation is also possible using <math>2N</math> signal followed by a multiplication by half shift. This is demonstrated by [[John Makhoul|Makhoul]].{{cn|date=April 2025}}

Some authors further multiply the <math>X_0</math> term by <math>1/\sqrt{N\,} \,</math> and multiply the rest of the matrix by an overall scale factor of <math display="inline">\sqrt{{2}/{N}}</math> (see below for the corresponding change in DCT-III). This makes the DCT-II matrix [[orthogonal matrix|orthogonal]], but breaks the direct correspondence with a real-even DFT of half-shifted input. This is the normalization used by [[Matlab]].<ref>{{cite web |url=https://www.mathworks.com/help/signal/ref/dct.html |title=Discrete cosine transform - MATLAB dct |website=www.mathworks.com |access-date=2019-07-11}}</ref> In many applications, such as [[JPEG]], the scaling is arbitrary because scale factors can be combined with a subsequent computational step (e.g. the [[Quantization (signal processing)|quantization]] step in JPEG<ref>{{cite book |isbn=9780442012724 |title=JPEG: Still Image Data Compression Standard |last1=Pennebaker |first1=William B. |last2=Mitchell |first2=Joan L. |date=31 December 1992|publisher=Springer }}</ref>), and a scaling can be chosen that allows the DCT to be computed with fewer multiplications.<ref>{{cite journal |url=https://search.ieice.org/bin/summary.php?id=e71-e_11_1095 |first1=Y. |last1=Arai |first2=T. |last2=Agui |first3=M. |last3=Nakajima |title=A fast DCT-SQ scheme for images |journal=IEICE Transactions |volume=71 |issue=11 |pages= 1095–1097 |year=1988}}</ref><ref>{{cite journal |doi=10.1016/j.sigpro.2008.01.004 |title=Type-II/III DCT/DST algorithms with reduced number of arithmetic operations |year=2008 |last1=Shao |first1=Xuancheng |last2=Johnson |first2=Steven G. |journal=Signal Processing |volume=88 |issue=6 |pages=1553–1564 |arxiv=cs/0703150 |bibcode=2008SigPr..88.1553S |s2cid=986733}}</ref>

The DCT-II implies the boundary conditions: <math>x_n</math> is even around <math>n = -1/2</math> and even around <math>n = N - 1/2 \,</math>; <math> X_k </math> is even around <math>k = 0</math> and odd around <math>k = N</math>.<!--[[User:Kvng/RTH]]-->

=== DCT-III ===
:<math> X_k =
  \tfrac{1}{2} x_0 +
  \sum_{n=1}^{N-1} x_n \cos \left[\, \tfrac{\,\pi\,}{N} \left( k + \tfrac{1}{2} \right) n \,\right]
  \qquad \text{ for } ~ k = 0,\ \ldots\ N-1 ~.</math>

Because it is the inverse of DCT-II up to a scale factor (see below), this form is sometimes simply referred to as "the inverse DCT" ("IDCT").<ref name="pubRaoYip"/>

Some authors divide the <math>x_0</math> term by <math>\sqrt{2}</math> instead of by 2 (resulting in an overall <math>x_0/\sqrt{2}</math> term) and multiply the resulting matrix by an overall scale factor of <math display="inline"> \sqrt{2/N}</math> (see above for the corresponding change in DCT-II), so that the DCT-II and DCT-III are transposes of one another. This makes the DCT-III matrix [[orthogonal matrix|orthogonal]], but breaks the direct correspondence with a real-even DFT of half-shifted output.

The DCT-III implies the boundary conditions: <math>x_n</math> is even around <math>n = 0</math> and odd around <math>n = N ;</math> <math>X_k</math> is even around <math>k = -1/2</math> and even around <math>k = N - 1/2.</math>

=== DCT-IV ===
:<math>X_k =
 \sum_{n=0}^{N-1} x_n \cos \left[\, \tfrac{\,\pi\,}{N} \, \left(n + \tfrac{1}{2} \right)\left(k + \tfrac{1}{2} \right) \,\right]
 \qquad \text{ for } k = 0,\ \ldots\ N-1 ~.</math>

The DCT-IV matrix becomes [[orthogonal matrix|orthogonal]] (and thus, being clearly symmetric, its own inverse) if one further multiplies by an overall scale factor of <math display="inline"> \sqrt{2/N}.</math>

A variant of the DCT-IV, where data from different transforms are ''overlapped'', is called the [[modified discrete cosine transform]] (MDCT).<ref>{{harvnb|Malvar|1992}}</ref>

The DCT-IV implies the boundary conditions: <math> x_n </math> is even around <math>n = -1/2</math> and odd around <math>n = N - 1/2;</math> similarly for <math>X_k.</math>

=== DCT V-VIII ===
DCTs of types I–IV treat both boundaries consistently regarding the point of symmetry: they are even/odd around either a data point for both boundaries or halfway between two data points for both boundaries. By contrast, DCTs of types V-VIII imply boundaries that are even/odd around a data point for one boundary and halfway between two data points for the other boundary.

In other words, DCT types I–IV are equivalent to real-even DFTs of even order (regardless of whether <math> N </math> is even or odd), since the corresponding DFT is of length <math> 2(N-1) </math> (for DCT-I) or <math> 4 N </math> (for DCT-II & III) or <math> 8 N </math> (for DCT-IV). The four additional types of discrete cosine transform<ref>{{harvnb|Martucci|1994}}</ref> correspond essentially to real-even DFTs of logically odd order, which have factors of <math> N \pm {1}/{2} </math> in the denominators of the cosine arguments.

However, these variants seem to be rarely used in practice. One reason, perhaps, is that [[Fast Fourier transform|FFT]] algorithms for odd-length DFTs are generally more complicated than [[Fast Fourier transform|FFT]] algorithms for even-length DFTs (e.g. the simplest radix-2 algorithms are only for even lengths), and this increased intricacy carries over to the DCTs as described below.

(The trivial real-even array, a length-one DFT (odd length) of a single number {{mvar|a}}&nbsp;, corresponds to a DCT-V of length <math> N = 1 .</math>)