Advanced Audio Coding
==Functionality==
AAC is a [[wideband audio]] coding algorithm that exploits two primary coding strategies to dramatically reduce the amount of data needed to represent high-quality digital audio:
* Signal components that are perceptually irrelevant are discarded.
* Redundancies in the coded audio signal are eliminated.

The actual encoding process consists of the following steps:
* The signal is converted from the time domain to the frequency domain using a forward [[modified discrete cosine transform|modified discrete cosine transform (MDCT)]]. This is done by using filter banks that take an appropriate number of time samples and convert them to frequency samples.
* The frequency-domain signal is quantized based on a [[psychoacoustic]] model and encoded.
* Internal error correction codes are added.
* The signal is stored or transmitted.
* In order to prevent corrupt samples, a modern implementation of the [[Luhn mod N algorithm]] is applied to each frame.<ref>US patent application [http://www.freepatentsonline.com/y2007/0297624.html 20070297624] ''Digital audio encoding''</ref>

The [[MPEG-4]] audio standard does not define a single or small set of highly efficient compression schemes but rather a complex toolbox to perform a wide range of operations from low bit rate speech coding to high-quality audio coding and music synthesis.
* The [[MPEG-4]] audio coding algorithm family spans the range from low bit rate speech encoding (down to 2 kbit/s) to high-quality audio coding (at 64 kbit/s per channel and higher).
* AAC offers sampling frequencies between 8 kHz and 96 kHz and any number of channels between 1 and 48.
* In contrast to MP3's hybrid filter bank, AAC uses the modified discrete cosine transform ([[MDCT]]) together with the increased window lengths of 1024 or 960 points. AAC encoders can switch dynamically between a single MDCT block of length 1024 points or 8 blocks of 128 points (or between 960 points and 120 points, respectively).
* If a signal change or a transient occurs, 8 shorter windows of 128/120 points each are chosen for their better temporal resolution.
* Otherwise, the longer 1024-point/960-point window is used by default, because the increased frequency resolution allows for a more sophisticated psychoacoustic model, resulting in improved coding efficiency.

===Modular encoding===
AAC takes a modular approach to encoding. Depending on the complexity of the bitstream to be encoded, the desired performance and the acceptable output, implementers may create profiles to define which of a specific set of tools they want to use for a particular application.

The MPEG-2 Part 7 standard (Advanced Audio Coding) was first published in 1997 and offers three default profiles:<ref name="iso13818-7-1997" /><ref name="iso13818-7-2004-pdf">{{cite web|url=http://jongyeob.com/moniwiki/pds/upload/13818-7.pdf |archive-url=https://web.archive.org/web/20110713115817/http://jongyeob.com/moniwiki/pds/upload/13818-7.pdf |url-status=dead |archive-date=13 July 2011 |title=ISO/IEC 13818-7, Third edition, Part 7 - Advanced Audio Coding (AAC) |publisher=[[ISO]] |page=32 |date=15 October 2004 |access-date=2009-10-19 }}</ref>
* '''Low Complexity (LC)''' – the simplest and most widely used and supported
* '''Main Profile (Main)''' – like the LC profile, with the addition of backward prediction
* '''[[MPEG-4 AAC-SSR|Scalable Sample Rate (SSR)]]''' a.k.a. Sample-Rate Scalable (SRS)

The MPEG-4 Part 3 standard (MPEG-4 Audio) defined various new compression tools (a.k.a. [[MPEG-4 Part 3#MPEG-4 Audio Object Types|Audio Object Types]]) and their usage in brand new profiles. AAC is not used in some of the MPEG-4 Audio profiles. The MPEG-2 Part 7 AAC LC profile, AAC Main profile and AAC SSR profile are combined with Perceptual Noise Substitution and defined in the MPEG-4 Audio standard as Audio Object Types (under the names AAC LC, AAC Main and AAC SSR).
These are combined with other Object Types in MPEG-4 Audio profiles.<ref name="mpeg4audio-profiles"/> Here is a list of some audio profiles defined in the MPEG-4 standard:<ref name="mpeg4audio-n7016"/><ref name="fhg-mpeg4audio">{{Cite conference|conference=109th AES Convention 2000 September 22–25 Los Angeles |url=http://www.iis.fraunhofer.de/fhg/Images/AES5270_MPEG-4_Audio_Components_on_various_Platforms_tcm278-67534.PDF |title=Implementation of MPEG-4 Audio Components on various Platforms |first1=Bernhard |last1=Grill |first2=Stefan |last2=Geyersberger |first3=Johannes |last3=Hilpert |first4=Bodo |last4=Teichmann |publisher=Fraunhofer Gesellschaft |date=July 2004 |access-date=2009-10-09 |url-status=dead |archive-url=https://web.archive.org/web/20070610222853/http://www.iis.fraunhofer.de/fhg/Images/AES5270_MPEG-4_Audio_Components_on_various_Platforms_tcm278-67534.PDF |archive-date=2007-06-10 }}</ref>
{{Main|MPEG-4 Part 3#Audio Profiles|l1=MPEG-4 Part 3: Audio Profiles}}
* '''Main Audio Profile''' – defined in 1999, uses most of the MPEG-4 Audio Object Types (AAC Main, AAC-LC, AAC-SSR, AAC-LTP, AAC Scalable, TwinVQ, CELP, HVXC, TTSI, Main synthesis)
* '''Scalable Audio Profile''' – defined in 1999, uses AAC-LC, AAC-LTP, AAC Scalable, TwinVQ, CELP, HVXC, TTSI
* '''Speech Audio Profile''' – defined in 1999, uses CELP, HVXC, TTSI
* '''Synthetic Audio Profile''' – defined in 1999, uses TTSI, Main synthesis
* '''High Quality Audio Profile''' – defined in 2000, uses AAC-LC, AAC-LTP, AAC Scalable, CELP, ER-AAC-LC, ER-AAC-LTP, ER-AAC Scalable, ER-CELP
* '''Low Delay Audio Profile''' – defined in 2000, uses CELP, HVXC, TTSI, ER-AAC-LD, ER-CELP, ER-HVXC
* '''Low Delay AAC v2''' – defined in 2012, uses AAC-LD, AAC-ELD and AAC-ELDv2<ref>{{Cite web|url=http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=59635|title=ISO/IEC 14496-3:2009/Amd 3:2012 - Transport of unified speech and audio coding (USAC)|website=ISO|access-date=2016-08-03|url-status=live|archive-url=https://web.archive.org/web/20160308175637/http://www.iso.org/iso/home/store/catalogue_tc/catalogue_detail.htm?csnumber=59635|archive-date=2016-03-08}}</ref>
* '''Mobile Audio Internetworking Profile''' – defined in 2000, uses ER-AAC-LC, ER-AAC-Scalable, ER-TwinVQ, ER-BSAC, ER-AAC-LD
* '''AAC Profile''' – defined in 2003, uses AAC-LC
* '''High Efficiency AAC Profile''' – defined in 2003, uses AAC-LC, SBR
* '''High Efficiency AAC v2 Profile''' – defined in 2006, uses AAC-LC, SBR, PS
* '''Extended High Efficiency AAC xHE-AAC''' – defined in 2012, uses [[Unified Speech and Audio Coding|USAC]]

One of many improvements in MPEG-4 Audio is an Object Type called Long Term Prediction (LTP), which is an improvement of the Main profile using a forward predictor with lower computational complexity.<ref name="mpeg4audio-mpeg2audio" />

===AAC error protection toolkit===
Applying error protection enables error correction up to a certain extent. Error correcting codes are usually applied equally to the whole payload. However, since different parts of an AAC payload show different sensitivity to transmission errors, this would not be a very efficient approach.

The AAC payload can be subdivided into parts with different error sensitivities.
* Independent error correcting codes can be applied to any of these parts using the Error Protection (EP) tool defined in the MPEG-4 Audio standard.
* This toolkit provides the error correcting capability to the most sensitive parts of the payload in order to keep the additional overhead low.
* The toolkit is backward compatible with simpler and pre-existing AAC decoders. Many of the toolkit's error correction functions are based on spreading information about the audio signal more evenly in the datastream.

===Error Resilient (ER) AAC===
Error Resilience (ER) techniques can be used to make the coding scheme itself more robust against errors.
For AAC, three custom-tailored methods were developed and defined in MPEG-4 Audio:
* '''Huffman Codeword Reordering (HCR)''' to avoid error propagation within spectral data
* '''Virtual Codebooks (VCB11)''' to detect serious errors within spectral data
* '''Reversible Variable Length Code (RVLC)''' to reduce error propagation within scale factor data

===AAC Low Delay===
{{Main|AAC-LD}}
The audio coding standards '''MPEG-4 Low Delay''' ([[AAC-LD]]), '''Enhanced Low Delay''' (AAC-ELD), and '''Enhanced Low Delay v2''' (AAC-ELDv2) as defined in ISO/IEC 14496-3:2009 and ISO/IEC 14496-3:2009/Amd 3 are designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. They are closely derived from the MPEG-2 Advanced Audio Coding (AAC) format.<ref>{{Cite web|url=http://www.iso.org/iso/catalogue_detail?csnumber=53943|title=ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio|website=ISO|access-date=2016-08-02|url-status=live|archive-url=https://web.archive.org/web/20160520232358/http://www.iso.org/iso/catalogue_detail?csnumber=53943|archive-date=2016-05-20}}</ref><ref>{{Cite web|url=http://www.iso.org/iso/catalogue_detail.htm?csnumber=59635|title=ISO/IEC 14496-3:2009/Amd 3:2012 - Transport of unified speech and audio coding (USAC)|website=ISO|access-date=2016-08-02|url-status=live|archive-url=https://web.archive.org/web/20160819222620/http://www.iso.org/iso/catalogue_detail.htm?csnumber=59635|archive-date=2016-08-19}}</ref><ref>{{Cite web|url=http://mpeg.chiariglione.org/standards/mpeg-4/audio/aac-eld-family-high-quality-communication-services|title=The AAC-ELD Family for High Quality Communication Services {{!}} MPEG|website=mpeg.chiariglione.org|access-date=2016-08-02|url-status=live|archive-url=https://web.archive.org/web/20160820105628/http://mpeg.chiariglione.org/standards/mpeg-4/audio/aac-eld-family-high-quality-communication-services|archive-date=2016-08-20}}</ref> AAC-ELD is recommended by [[GSMA]] as a super-wideband voice codec in the IMS Profile for High Definition Video Conference (HDVC) Service.<ref>{{Cite book|url=http://www.gsma.com/newsroom/wp-content/uploads//IR.39-v7.0.pdf|title=IMS Profile for High Definition Video Conference (HDVC) Service|publisher=GSMA|date=24 May 2016|pages=10|url-status=live|archive-url=https://web.archive.org/web/20160818201412/http://www.gsma.com/newsroom/wp-content/uploads//IR.39-v7.0.pdf|archive-date=18 August 2016}}</ref>
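
The block switching described under Functionality (one 1024-point MDCT block for stationary signals, eight 128-point blocks when a transient occurs) can be illustrated with a minimal Python sketch. The function names, the energy-ratio transient detector and its threshold of 8 are assumptions made purely for illustration; actual encoders are free to use their own, typically psychoacoustically driven, decision logic, and the transform below omits the windowing and overlap used in practice.

```python
import math

LONG_BLOCK = 1024   # coefficients in one long block (figure taken from the text)
SHORT_BLOCK = 128   # coefficients in one short block (figure taken from the text)
SHORT_COUNT = 8     # number of short blocks that replace one long block


def mdct(block):
    """Direct O(N^2) MDCT: 2N time samples in, N coefficients out.

    Illustration only: no window function or overlap-add is applied.
    """
    n = len(block) // 2
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i, x in enumerate(block))
        for k in range(n)
    ]


def has_transient(frame, ratio=8.0):
    """Hypothetical detector: flag the frame if the energy of any
    128-sample segment exceeds `ratio` times that of the previous one."""
    energies = [
        sum(s * s for s in frame[i:i + SHORT_BLOCK])
        for i in range(0, len(frame), SHORT_BLOCK)
    ]
    return any(cur > ratio * max(prev, 1e-12)
               for prev, cur in zip(energies, energies[1:]))


def choose_block_layout(frame):
    """Return the MDCT block lengths to use for one 1024-sample frame."""
    if len(frame) != LONG_BLOCK:
        raise ValueError("expected a frame of 1024 samples")
    if has_transient(frame):
        return [SHORT_BLOCK] * SHORT_COUNT   # better temporal resolution
    return [LONG_BLOCK]                      # better frequency resolution


# A steady 440 Hz tone (44.1 kHz sampling assumed) versus silence followed by an attack.
tone = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(LONG_BLOCK)]
attack = [0.0] * 512 + [1.0] * 512
print(choose_block_layout(tone))
print(choose_block_layout(attack))
```

Either layout yields 1024 spectral coefficients per frame (1 × 1024 or 8 × 128), so the choice trades frequency resolution against temporal resolution without changing the number of coefficients to be coded.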