Editing Modified discrete cosine transform (section)

== Relationship to DCT-IV and origin of TDAC ==

As can be seen by inspection of the definitions, for ''even'' ''N'' the MDCT is essentially equivalent to a DCT-IV, where the input is shifted by ''N''/2 and two ''N''-blocks of data are transformed at once.  By examining this equivalence more carefully, important properties like TDAC can be easily derived.

In order to define the precise relationship to the DCT-IV, one must realize that the DCT-IV corresponds to alternating even/odd boundary conditions: even at its left boundary (around ''n''&nbsp;=&nbsp;&minus;1/2), odd at its right boundary (around ''n''&nbsp;=&nbsp;''N''&nbsp;&minus;&nbsp;1/2), and so on (instead of periodic boundaries as for a [[discrete Fourier transform|DFT]]).  This follows from the identities
: <math>\cos\left[\frac{\pi}{N} \left(-n - 1 + \frac{1}{2}\right) \left(k + \frac{1}{2}\right)\right] =
 \cos\left[\frac{\pi}{N} \left(n + \frac{1}{2}\right) \left(k + \frac{1}{2}\right)\right]</math>
and
: <math>\cos\left[\frac{\pi}{N} \left(2N - n - 1 + \frac{1}{2}\right) \left(k + \frac{1}{2}\right)\right] =
 -\cos\left[\frac{\pi}{N} \left(n + \frac{1}{2}\right) \left(k + \frac{1}{2}\right)\right].</math>
Thus, if its inputs are an array ''x'' of length ''N'', we can imagine extending this array to (''x'', &minus;''x''<sub>''R''</sub>, &minus;''x'', ''x''<sub>''R''</sub>, ...) and so on, where ''x''<sub>''R''</sub> denotes ''x'' in reverse order.

Consider an MDCT with 2''N'' inputs and ''N'' outputs, where we divide the inputs into four blocks (''a'', ''b'', ''c'', ''d'') each of size ''N''/2.  If we shift these to the right by ''N''/2 (from the +''N''/2 term in the MDCT definition), then (''b'', ''c'', ''d'') extend past the end of the ''N'' DCT-IV inputs, so we must "fold" them back according to the boundary conditions described above.

: Thus, the MDCT of 2''N'' inputs (''a'', ''b'', ''c'', ''d'') is ''exactly'' equivalent to a DCT-IV of the ''N'' inputs: (&minus;''c''<sub>''R''</sub>&nbsp;&minus;&nbsp;''d'', ''a''&nbsp;&minus;&nbsp;''b''<sub>''R''</sub>), where ''R'' denotes reversal as above.

In this way, any algorithm to compute the DCT-IV can be trivially applied to the MDCT.

Similarly, the IMDCT formula above is precisely 1/2 of the DCT-IV (which is its own inverse), where the output is extended (via the boundary conditions) to a length 2''N'' and shifted back to the left by ''N''/2.  The inverse DCT-IV would simply give back the inputs (&minus;''c''<sub>''R''</sub>&nbsp;&minus;&nbsp;''d'', ''a''&nbsp;&minus;&nbsp;''b''<sub>''R''</sub>) from above.  When this is extended via the boundary conditions and shifted, one obtains
: IMDCT(MDCT(''a'', ''b'', ''c'', ''d'')) = (''a'' − ''b''<sub>''R''</sub>, ''b'' − ''a''<sub>''R''</sub>, ''c'' + ''d''<sub>''R''</sub>, ''d'' + ''c''<sub>''R''</sub>)/2.

Half of the IMDCT outputs are thus redundant, as ''b''&nbsp;&minus;&nbsp;''a''<sub>''R''</sub> = &minus;(''a''&nbsp;&minus;&nbsp;''b''<sub>''R''</sub>)<sub>''R''</sub>, and likewise for the last two terms. If we group the input into bigger blocks ''A'',''B'' of size ''N'', where ''A''&nbsp;=&nbsp;(''a'', ''b'') and ''B''&nbsp;=&nbsp;(''c'', ''d''), we can write this result in a simpler way:
: IMDCT(MDCT(''A'', ''B'')) = (''A'' − ''A''<sub>''R''</sub>, ''B'' + ''B''<sub>''R''</sub>)/2.

One can now understand how TDAC works.  Suppose that one computes the MDCT of the subsequent, 50% overlapped, 2''N'' block (''B'', ''C'').  The IMDCT will then yield, analogous to the above: (''B''&nbsp;&minus;&nbsp;''B''<sub>''R''</sub>, ''C''&nbsp;+&nbsp;''C''<sub>''R''</sub>)/2.  When this is added with the previous IMDCT result in the overlapping half, the reversed terms cancel and one obtains simply ''B'', recovering the original data.

=== Origin of TDAC ===

The origin of the term "time-domain aliasing cancellation" is now clear.  The use of input data that extend beyond the boundaries of the logical DCT-IV causes the data to be ''aliased'' in the same way that frequencies beyond the [[Nyquist frequency]] are [[aliasing|aliased]] to lower frequencies, except that this aliasing occurs in the time domain instead of the frequency domain: we cannot distinguish the contributions of
''a'' and of ''b''<sub>''R''</sub> to the MDCT of (''a'', ''b'', ''c'', ''d''), or equivalently, to
the result of
: IMDCT(MDCT(''a'', ''b'', ''c'', ''d''))= (''a'' − ''b''<sub>''R''</sub>, ''b'' − ''a''<sub>''R''</sub>, ''c'' + ''d''<sub>''R''</sub>, ''d'' + ''c''<sub>''R''</sub>)/2.
The combinations ''c'' − ''d''<sub>''R''</sub> and so on have precisely the right signs for the combinations to cancel when they are added.

For ''odd'' ''N'' (which are rarely used in practice), ''N''/2 is not an integer, so the MDCT is not simply a shift permutation of a DCT-IV.  In this case, the additional shift by half a sample means that the MDCT/IMDCT becomes equivalent to the DCT-III/II, and the analysis is analogous to the above.

=== Smoothness and discontinuities ===

We have seen above that the MDCT of 2''N'' inputs (''a'', ''b'', ''c'', ''d'') is equivalent to a DCT-IV of the ''N'' inputs (−''c''<sub>''R''</sub> − ''d'', ''a'' − ''b''<sub>''R''</sub>).
The DCT-IV is designed for the case where the function at the right boundary is odd, and therefore the values near the right boundary are close to 0. If the input signal is smooth, this is the case: the rightmost components of ''a'' and ''b''<sub>''R''</sub> are consecutive in the input sequence (''a'', ''b'', ''c'', ''d''), and therefore their difference is small.
Let us look at the middle of the interval: if we rewrite the above expression as (−''c''<sub>''R''</sub> − ''d'', ''a'' − ''b''<sub>''R''</sub>) = (−''d'', ''a'') − (''b'', ''c'')<sub>''R''</sub>, the second term, (''b'', ''c'')<sub>''R''</sub>, gives a smooth transition in the middle.
However, in the first term, (−''d'', ''a''), there is a potential discontinuity where the right end of −''d'' meets the left end of ''a''.
This is the reason for using a window function that reduces the components near the boundaries of the input sequence (''a'', ''b'', ''c'', ''d'') towards 0.

=== TDAC for the windowed MDCT ===

Above, the TDAC property was proved for the ordinary MDCT, showing that adding IMDCTs of subsequent blocks in their overlapping half recovers the original data.  The derivation of this inverse property for the windowed MDCT is only slightly more complicated.

Consider two overlapping consecutive sets of 2''N'' inputs (''A'',''B'') and (''B'',''C''), for blocks ''A'',''B'',''C'' of size ''N''.
Recall from above that when <math>(A,B)</math> and <math>(B,C)</math> are MDCTed, IMDCTed, and added in their overlapping half, we obtain <math>(B+B_R) / 2 + (B-B_R) / 2 = B</math>, the original data.

Now we suppose that we multiply ''both'' the MDCT inputs ''and'' the IMDCT outputs by a window function of length 2''N''.  As above, we assume a symmetric window function, which is therefore of the form <math>(W,W_R)</math> where ''W'' is a length-''N'' vector and ''R'' denotes reversal as before.  Then the Princen-Bradley condition can be written as <math>W^2 + W_R^2 = (1,1,\ldots)</math>, with the squares and additions performed elementwise.

Therefore, instead of MDCTing <math>(A,B)</math>, we now MDCT <math>(WA,W_R B)</math> (with all multiplications performed elementwise).  When this is IMDCTed and multiplied again (elementwise) by the window function, the last-''N'' half becomes:
:<math>W_R \cdot (W_R B+(W_R B)_R) =W_R \cdot (W_R B+W B_R) = W_R^2 B+WW_R B_R</math>.
(Note that we no longer have the multiplication by 1/2, because the IMDCT normalization differs by a factor of 2 in the windowed case.)

Similarly, the windowed MDCT and IMDCT of <math>(B,C)</math> yields, in its first-''N'' half:
:<math>W \cdot (WB - W_R B_R) = W^2 B - W W_R B_R</math>.

When we add these two halves together, we obtain:

:<math>(W_R^2 B+WW_R B_R) + (W^2 B - W W_R B_R)= \left(W_R^2 + W^2\right)B = B,</math>

recovering the original data.