== Transform coding ==
{{main|Transform coding}}

Some forms of lossy compression can be thought of as an application of [[transform coding]], a type of data compression used for [[digital images]], [[digital audio]] [[Signal (information theory)|signals]], and [[digital video]]. The transformation is typically used to enable better (more targeted) [[quantization (signal processing)|quantization]]. Knowledge of the application is used to choose which information to discard, thereby lowering the [[Bandwidth (computing)|bandwidth]] the data requires. The remaining information can then be compressed by a variety of methods. When the output is decoded, the result may not be identical to the original input, but it is expected to be close enough for the purpose of the application.

The most common form of lossy compression is a transform coding method, the [[discrete cosine transform]] (DCT),<ref>{{cite web |title=Data compression |url=https://www.britannica.com/technology/data-compression |website=[[Encyclopedia Britannica]] |access-date=13 August 2019 }}</ref> which was first published by [[N. Ahmed|Nasir Ahmed]], T. Natarajan and [[K. R. Rao]] in 1974.<ref name="pubDCT">{{Citation |first1=Nasir |last1=Ahmed |author1-link=N. Ahmed |first2=T. |last2=Natarajan |first3=K. R. |last3=Rao |author3-link=K. R. Rao |title=Discrete Cosine Transform |journal=IEEE Transactions on Computers |date=January 1974 |volume=C-23 |issue=1 |pages=90–93 |doi=10.1109/T-C.1974.223784 |s2cid=149806273 }}</ref> The DCT is used in popular [[image compression]] formats (such as [[JPEG]]),<ref>{{cite web|url=https://www.w3.org/Graphics/JPEG/itu-t81.pdf|title=T.81 – DIGITAL COMPRESSION AND CODING OF CONTINUOUS-TONE STILL IMAGES – REQUIREMENTS AND GUIDELINES|date=September 1992|publisher=CCITT|access-date=12 July 2019}}</ref> [[video coding standards]] (such as [[MPEG]] and [[H.264/AVC]]) and [[audio compression (data)|audio compression]] formats (such as [[MP3]] and [[Advanced Audio Coding|AAC]]).

In the case of audio data, a popular form of transform coding is [[perceptual coding]], which transforms the raw data to a domain that more accurately reflects the information content. For example, rather than expressing a sound file as the amplitude levels over time, one may express it as the frequency spectrum over time, which corresponds more accurately to human audio perception.

While data reduction (compression, be it lossy or lossless) is a main goal of transform coding, it also serves other goals. One may represent data more accurately for the original amount of space:<ref>"Although one main goal of digital audio perceptual coders is data reduction, this is not a necessary characteristic. As we shall see, perceptual coding can be used to improve the representation of digital audio through advanced bit allocation." [http://www.noisebetweenstations.com/personal/essays/audio_on_the_internet/MaskingPaper.html Masking and Perceptual Coding], Victor Lombardi, noisebetweenstations.com</ref> for example, in principle, if one starts with an analog or high-resolution [[digital master]], an [[MP3]] file of a given size should provide a better representation than raw uncompressed audio in a [[WAV]] or [[Audio Interchange File Format|AIFF]] file of the same size.
This is because uncompressed audio can reduce file size only by lowering the sampling rate or bit depth, whereas compressed audio can reduce size while maintaining both. The compression thus becomes a selective loss of the least significant data, rather than a loss of data across the board.

Further, transform coding may provide a better domain for manipulating or otherwise editing the data. For example, [[Equalization (audio)|equalization]] of audio is most naturally expressed in the frequency domain (boost the bass, for instance) rather than in the raw time domain. From this point of view, perceptual encoding is not essentially about ''discarding'' data, but rather about a ''better representation'' of data.

Another use is for [[backward compatibility]] and [[graceful degradation]]: in color television, encoding color via a [[Luminance (video)|luminance]]-[[chrominance]] transform domain (such as [[YUV]]) means that black-and-white sets display the luminance while ignoring the color information. Another example is [[chroma subsampling]]: the use of [[color space]]s such as [[YIQ]], used in [[NTSC]], allows one to reduce the resolution of the components to accord with human perception. Humans have the highest resolution for black-and-white (luma), lower resolution for mid-spectrum colors like yellow and green, and the lowest for reds and blues. NTSC accordingly displays approximately 350 pixels of luma per [[scanline]], 150 pixels of yellow vs. green, and 50 pixels of blue vs. red, which are proportional to human sensitivity to each component.
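
As a rough illustration of the luminance-chrominance idea, the following Python sketch converts one scanline from RGB to YIQ with the standard library's <code>colorsys</code> module, keeps luma at full resolution, and coarsens the two chroma channels by averaging runs of neighboring pixels. The function names and the factors 2 and 7 are assumptions made for this example (they only loosely echo the 350:150:50 ratio above); the sketch is digital and does not model the analog filtering of an actual NTSC signal.

```python
import colorsys


def box_subsample(channel, factor):
    """Replace each run of `factor` samples with the mean of that run."""
    out = []
    for start in range(0, len(channel), factor):
        block = channel[start:start + factor]
        out.extend([sum(block) / len(block)] * len(block))
    return out


def encode_scanline(rgb_pixels, i_factor=2, q_factor=7):
    """Split a scanline into full-resolution luma and coarsened chroma.

    rgb_pixels is a list of (r, g, b) tuples with components in [0, 1].
    i_factor and q_factor are illustrative choices, not NTSC parameters.
    """
    yiq = [colorsys.rgb_to_yiq(r, g, b) for (r, g, b) in rgb_pixels]
    luma = [y for (y, _, _) in yiq]
    chroma_i = box_subsample([i for (_, i, _) in yiq], i_factor)
    chroma_q = box_subsample([q for (_, _, q) in yiq], q_factor)
    return luma, chroma_i, chroma_q


def decode_scanline(luma, chroma_i, chroma_q):
    """Recombine the three channels into RGB (colorsys clamps to [0, 1])."""
    return [colorsys.yiq_to_rgb(y, i, q)
            for y, i, q in zip(luma, chroma_i, chroma_q)]


def grayscale_view(luma):
    """What a luma-only (black-and-white) display would show."""
    return [(y, y, y) for y in luma]


if __name__ == "__main__":
    # A scanline alternating between pure red and pure blue pixels.
    line = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)] * 7
    luma, chroma_i, chroma_q = encode_scanline(line)
    print("distinct I values kept:", len(set(chroma_i)))
    print("distinct Q values kept:", len(set(chroma_q)))
    print("first decoded pixel:", decode_scanline(luma, chroma_i, chroma_q)[0])
```

Because the luma channel passes through unchanged, <code>grayscale_view</code> yields the same picture a black-and-white receiver would have shown before any chroma was discarded, which is the backward-compatibility property described above. All of the loss is confined to the chroma channels.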