== Introduction ==
[[File:Rate distortion theory problem setup.svg|thumb|512px|right|Rate distortion encoder and decoder. An encoder <math>f_n</math> encodes a sequence <math>X^n</math>. The encoded sequence <math>Y^n</math> is then fed to a decoder <math>g_n</math>, which outputs a sequence <math>\hat{X}^n</math>. We try to minimize the distortion between the original sequence <math>X^n</math> and the reconstructed sequence <math>\hat{X}^n</math>.]]

Rate–distortion theory gives an analytical expression for how much compression can be achieved using lossy compression methods. Many of the existing audio, speech, image, and video compression techniques have transforms, quantization, and bit-rate allocation procedures that capitalize on the general shape of rate–distortion functions. Rate–distortion theory was created by [[Claude Shannon]] in his foundational work on information theory.

In rate–distortion theory, the ''rate'' is usually understood as the number of [[bit]]s per data sample to be stored or transmitted. The notion of ''distortion'' is a subject of ongoing discussion.<ref>{{cite conference |last=Blau |first=Y. |last2=Michaeli |first2=T. |url=http://proceedings.mlr.press/v97/blau19a/blau19a.pdf |title=Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff |book-title=Proceedings of the International Conference on Machine Learning |date=2019 |publisher=PMLR |pages=675–685 |arxiv=1901.07821}}</ref> In the simplest case (which is in fact used in most cases), the distortion is defined as the expected value of the square of the difference between the input and output signals (i.e., the [[mean squared error]]). However, since most [[lossy compression]] techniques operate on data that will be perceived by human consumers (listening to [[music]], watching pictures and video), the distortion measure should preferably be modeled on human [[perception]] and perhaps [[aesthetics]]: much like the use of [[probability]] in [[lossless compression]], distortion measures can ultimately be identified with [[loss function]]s as used in Bayesian [[estimation theory|estimation]] and [[decision theory]].

In audio compression, perceptual models (and therefore perceptual distortion measures) are relatively well developed and routinely used in compression techniques such as [[MP3]] or [[Vorbis]], but they are often not easy to include in rate–distortion theory. In image and video compression, the human perception models are less well developed, and their inclusion is mostly limited to the [[JPEG]] and [[MPEG]] weighting ([[quantization (signal processing)|quantization]], [[Normalization (image processing)|normalization]]) matrix.
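For example, under the squared-error criterion mentioned above, the per-sample distortion between an input value <math>x</math> and its reconstruction <math>\hat{x}</math> is <math>d(x, \hat{x}) = (x - \hat{x})^2</math>, and the distortion incurred by the encoder–decoder pair <math>(f_n, g_n)</math> on a block of <math>n</math> samples can be written as the expected average

<math display="block">D = \mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n} \bigl(X_i - \hat{X}_i\bigr)^2\right],</math>

which the code attempts to keep below a target level while using as few bits per sample (the rate) as possible.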