== Introduction ==
[[File:Rate distortion theory problem setup.svg|thumb|512px|right|Rate distortion encoder and decoder. An encoder <math>f_n</math> encodes a sequence <math>X^n</math>. The encoded sequence <math>Y^n</math> is then fed to a decoder <math>g_n</math>, which outputs a sequence <math>\hat{X}^n</math>. We try to minimize the distortion between the original sequence <math>X^n</math> and the reconstructed sequence <math>\hat{X}^n</math>.]]

Rate–distortion theory gives an analytical expression for how much compression can be achieved using lossy compression methods. Many of the existing audio, speech, image, and video compression techniques have transforms, quantization, and bit-rate allocation procedures that capitalize on the general shape of rate–distortion functions. Rate–distortion theory was created by [[Claude Shannon]] in his foundational work on information theory.

In rate–distortion theory, the ''rate'' is usually understood as the number of [[bit]]s per data sample to be stored or transmitted. The notion of ''distortion'' is a subject of ongoing discussion.<ref>{{cite conference |last=Blau |first=Y. |last2=Michaeli |first2=T. |url=http://proceedings.mlr.press/v97/blau19a/blau19a.pdf |title=Rethinking Lossy Compression: The Rate-Distortion-Perception Tradeoff |book-title=Proceedings of the International Conference on Machine Learning |date=2019 |publisher=PMLR |pages=675–685 |arxiv=1901.07821}}</ref> In the simplest case (which is in fact used in most cases), the distortion is defined as the expected value of the square of the difference between the input and output signals (i.e., the [[mean squared error]]). However, since most [[lossy compression]] techniques operate on data that will be perceived by human consumers (listening to [[music]], watching pictures and video), the distortion measure should preferably be modeled on human [[perception]] and perhaps [[aesthetics]]: much like the use of [[probability]] in [[lossless compression]], distortion measures can ultimately be identified with [[loss function]]s as used in Bayesian [[estimation theory|estimation]] and [[decision theory]].

In audio compression, perceptual models (and therefore perceptual distortion measures) are relatively well developed and routinely used in compression techniques such as [[MP3]] or [[Vorbis]], but they are often not easy to include in rate–distortion theory. In image and video compression, the human perception models are less well developed, and their inclusion is mostly limited to the [[JPEG]] and [[MPEG]] weighting ([[quantization (signal processing)|quantization]], [[Normalization (image processing)|normalization]]) matrix.
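For example, under the squared-error criterion mentioned above, the per-sample distortion between an input value <math>x</math> and its reconstruction <math>\hat{x}</math> is <math>d(x, \hat{x}) = (x - \hat{x})^2</math>, and the distortion incurred by the encoder–decoder pair <math>(f_n, g_n)</math> on a block of <math>n</math> samples can be written as the expected average

<math display="block">D = \mathbb{E}\!\left[\frac{1}{n}\sum_{i=1}^{n} \bigl(X_i - \hat{X}_i\bigr)^2\right],</math>

which the code attempts to keep below a target level while using as few bits per sample (the rate) as possible.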