Advanced Video Coding
== Design ==
=== Features ===
[[File:H.264 block diagram with quality score.jpg|thumb|150px|Block diagram of H.264]]
H.264/AVC/MPEG-4 Part 10 contains a number of new features that allow it to compress video much more efficiently than older standards and to provide more flexibility for application to a wide variety of network environments. Key features include:
* Multi-picture [[inter frame|inter-picture prediction]] including the following features:
** Using previously encoded pictures as references in a much more flexible way than in past standards, allowing up to 16 reference frames (or 32 reference fields, in the case of interlaced encoding) to be used in some cases. In profiles that support non-[[Network Abstraction Layer#Coded Video Sequences|IDR]] frames, most levels specify that sufficient buffering should be available to allow for at least 4 or 5 reference frames at maximum resolution. This is in contrast to prior standards, where the limit was typically one; or, in the case of conventional "[[Video compression picture types#Bi-directional predicted frames/slices (B-frames/slices)|B pictures]]" (B-frames), two.
** Variable block-size [[motion compensation]] (VBSMC) with block sizes as large as 16×16 and as small as 4×4, enabling precise segmentation of moving regions. The supported [[Luma (video)|luma]] prediction block sizes include 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4, many of which can be used together in a single macroblock. Chroma prediction block sizes are correspondingly smaller when [[chroma subsampling]] is used.
** The ability to use multiple motion vectors per macroblock (one or two per partition) with a maximum of 32 in the case of a B macroblock constructed of 16 4×4 partitions. The motion vectors for each 8×8 or larger partition region can point to different reference pictures.
** The ability to use any macroblock type in [[Video compression picture types#Bi-directional predicted frames/slices (B-frames/slices)|B-frames]], including I-macroblocks, resulting in much more efficient encoding when using B-frames. This feature was notably left out of [[MPEG-4 ASP]].
** Six-tap filtering for derivation of half-pel luma sample predictions, for sharper subpixel motion compensation. Quarter-pixel motion is derived by linear interpolation of the half-pixel values, to save processing power.
** [[Qpel|Quarter-pixel]] precision for motion compensation, enabling precise description of the displacements of moving areas. For [[Chrominance|chroma]], the resolution is typically halved both vertically and horizontally (see [[4:2:0]]); therefore, the motion compensation of chroma uses one-eighth chroma pixel grid units.
** Weighted prediction, allowing an encoder to specify the use of a scaling and offset when performing motion compensation, and providing a significant benefit in performance in special cases such as fade-to-black, fade-in, and cross-fade transitions. This includes implicit weighted prediction for B-frames, and explicit weighted prediction for P-frames.
* Spatial prediction from the edges of neighboring blocks for [[Intra-frame|"intra"]] coding, rather than the "DC"-only prediction found in MPEG-2 Part 2 and the transform coefficient prediction found in H.263v2 and MPEG-4 Part 2. This includes luma prediction block sizes of 16×16, 8×8, and 4×4 (of which only one type can be used within each [[macroblock]]).
* Integer [[discrete cosine transform]] (integer DCT),<ref name="Wang">{{cite journal |last1=Wang |first1=Hanli |last2=Kwong |first2=S. |last3=Kok |first3=C.
|s2cid=2060937 |title=Efficient prediction algorithm of integer DCT coefficients for H.264/AVC optimization |journal=IEEE Transactions on Circuits and Systems for Video Technology |date=2006 |volume=16 |issue=4 |pages=547–552 |doi=10.1109/TCSVT.2006.871390}}</ref><ref name="Stankovic">{{cite journal |last1=Stanković |first1=Radomir S. |last2=Astola |first2=Jaakko T. |title=Reminiscences of the Early Work in DCT: Interview with K.R. Rao |journal=Reprints from the Early Days of Information Sciences |date=2012 |volume=60 |page=17 |url=http://ticsp.cs.tut.fi/reports/ticsp-report-60-reprint-rao-corrected.pdf#page=18 |access-date=13 October 2019}}</ref><ref>{{cite book |last1=Kwon |first1=Soon-young |last2=Lee |first2=Joo-kyong |last3=Chung |first3=Ki-dong |title=Image Analysis and Processing – ICIAP 2005 |chapter=Half-Pixel Correction for MPEG-2/H.264 Transcoding |series=Lecture Notes in Computer Science |date=2005 |volume=3617 |pages=576–583 |doi=10.1007/11553595_71 |publisher=Springer Berlin Heidelberg|isbn=978-3-540-28869-5 |doi-access=free }}</ref> a type of discrete cosine transform (DCT)<ref name="Stankovic"/> where the transform is an integer approximation of the standard DCT.<ref name="Britanak2010">{{cite book |last1=Britanak |first1=Vladimir |last2=Yip |first2=Patrick C. |last3=Rao |first3=K. R. |author3-link=K. R. 
Rao |title=Discrete Cosine and Sine Transforms: General Properties, Fast Algorithms and Integer Approximations |date=2010 |publisher=[[Elsevier]] |isbn=9780080464640 |pages=ix, xiii, 1, 141–304 |url=https://books.google.com/books?id=iRlQHcK-r_kC&pg=PA141}}</ref> It has selectable block sizes<ref name="apple">{{cite web |last1=Thomson |first1=Gavin |last2=Shah |first2=Athar |title=Introducing HEIF and HEVC |url=https://devstreaming-cdn.apple.com/videos/wwdc/2017/503i6plfvfi7o3222/503/503_introducing_heif_and_hevc.pdf |publisher=[[Apple Inc.]] |year=2017 |access-date=5 August 2019}}</ref> and exact-match integer computation to reduce complexity, including:
** An exact-match integer 4×4 spatial block transform, allowing precise placement of [[residual frame|residual]] signals with little of the "[[ringing artifact|ringing]]" often found with prior codec designs. It is similar to the standard DCT used in previous standards, but uses a smaller block size and simple integer processing. Unlike the cosine-based formulas and tolerances expressed in earlier standards (such as H.261 and MPEG-2), integer processing provides an exactly specified decoded result.
** An exact-match integer 8×8 spatial block transform, allowing highly correlated regions to be compressed more efficiently than with the 4×4 transform. This design is based on the standard DCT, but simplified and made to provide exactly specified decoding.
** Adaptive encoder selection between the 4×4 and 8×8 transform block sizes for the integer transform operation.
** A secondary [[Hadamard transform]] performed on the "DC" coefficients of the primary spatial transform, applied to chroma DC coefficients (and also to luma in one special case), to obtain even more compression in smooth regions.
* [[Lossless]] macroblock coding features including:
** A lossless "PCM macroblock" representation mode in which video data samples are represented directly,<ref>{{cite web|url=http://www.fastvdo.com/spie04/spie04-h264OverviewPaper.pdf |title=The H.264/AVC Advanced Video Coding Standard: Overview and Introduction to the Fidelity Range Extensions |access-date=2011-07-30}}</ref> allowing perfect representation of specific regions and allowing a strict limit to be placed on the quantity of coded data for each macroblock.
** An enhanced lossless macroblock representation mode allowing perfect representation of specific regions while ordinarily using substantially fewer bits than the PCM mode.
* Flexible [[Interlaced video|interlace]]d-scan video coding features, including:
** Macroblock-adaptive frame-field (MBAFF) coding, using a macroblock pair structure for pictures coded as frames, allowing 16×16 macroblocks in field mode (compared with MPEG-2, where field mode processing in a picture that is coded as a frame results in the processing of 16×8 half-macroblocks).
** Picture-adaptive frame-field coding (PAFF or PicAFF) allowing a freely selected mixture of pictures coded either as complete frames where both fields are combined for encoding or as individual single fields.
* A quantization design including:
** Logarithmic step size control for easier bit rate management by encoders and simplified inverse-quantization scaling
** Frequency-customized quantization scaling matrices selected by the encoder for perceptual-based quantization optimization
* An in-loop [[Deblocking filter (video)|deblocking filter]] that helps prevent the blocking artifacts common to other DCT-based image compression techniques, resulting in better visual appearance and compression efficiency
* An [[entropy encoding|entropy coding]] design including:
** [[Context-adaptive binary arithmetic coding]] (CABAC), an algorithm to losslessly compress syntax elements in the video stream knowing the probabilities of syntax elements in a given context. CABAC compresses data more efficiently than CAVLC but requires considerably more processing to decode.
** [[Context-adaptive variable-length coding]] (CAVLC), which is a lower-complexity alternative to CABAC for the coding of quantized transform coefficient values. Although lower complexity than CABAC, CAVLC is more elaborate and more efficient than the methods typically used to code coefficients in other prior designs.
** A common simple and highly structured [[Variable-length code|variable length coding]] (VLC) technique for many of the syntax elements not coded by CABAC or CAVLC, referred to as [[Exponential-Golomb coding]] (or Exp-Golomb).
* Loss resilience features including:
** A [[Network Abstraction Layer]] (NAL) definition allowing the same video syntax to be used in many network environments. One very fundamental design concept of H.264 is to generate self-contained packets, to remove the header duplication as in MPEG-4's Header Extension Code (HEC).<ref name="rfc3984_3"/> This was achieved by decoupling information relevant to more than one slice from the media stream. The combination of the higher-level parameters is called a parameter set.<ref name="rfc3984_3"/> The H.264 specification includes two types of parameter sets: Sequence Parameter Set (SPS) and Picture Parameter Set (PPS). An active sequence parameter set remains unchanged throughout a coded video sequence, and an active picture parameter set remains unchanged within a coded picture. The sequence and picture parameter set structures contain information such as picture size, optional coding modes employed, and macroblock to slice group map.<ref name="rfc3984_3">RFC 3984, p.3</ref>
** [[Flexible macroblock ordering]] (FMO), also known as slice groups, and arbitrary slice ordering (ASO), which are techniques for restructuring the ordering of the representation of the fundamental regions (''macroblocks'') in pictures. Typically considered an error/loss robustness feature, FMO and ASO can also be used for other purposes.
** Data partitioning (DP), a feature providing the ability to separate more important and less important syntax elements into different packets of data, enabling the application of unequal error protection (UEP) and other types of improvement of error/loss robustness.
** Redundant slices (RS), an error/loss robustness feature that lets an encoder send an extra representation of a picture region (typically at lower fidelity) that can be used if the primary representation is corrupted or lost.
** Frame numbering, a feature that allows the creation of "sub-sequences", enabling temporal scalability by optional inclusion of extra pictures between other pictures, and the detection and concealment of losses of entire pictures, which can occur due to network packet losses or channel errors.
* Switching slices, called SP and SI slices, allowing an encoder to direct a decoder to jump into an ongoing video stream for such purposes as video streaming bit rate switching and "trick mode" operation. When a decoder jumps into the middle of a video stream using the SP/SI feature, it can get an exact match to the decoded pictures at that location in the video stream despite using different pictures, or no pictures at all, as references prior to the switch.
* A simple automatic process for preventing the accidental emulation of [[start code]]s, which are special sequences of bits in the coded data that allow random access into the bitstream and recovery of byte alignment in systems that can lose byte synchronization.
* Supplemental enhancement information (SEI) and video usability information (VUI), which are extra information that can be inserted into the bitstream for various purposes such as indicating the color space used by the video content or various constraints that apply to the encoding. SEI messages can contain arbitrary user-defined metadata payloads or other messages with syntax and semantics defined in the standard.
* Auxiliary pictures, which can be used for such purposes as [[alpha compositing]].
* Support of monochrome (4:0:0), 4:2:0, 4:2:2, and 4:4:4 [[chroma sampling]] (depending on the selected profile).
* Support of sample bit depth precision ranging from 8 to 14 bits per sample (depending on the selected profile).
* The ability to encode individual color planes as distinct pictures with their own slice structures, macroblock modes, motion vectors, etc., allowing encoders to be designed with a simple parallelization structure (supported only in the three 4:4:4-capable profiles).
* Picture order count, a feature that serves to keep the ordering of the pictures and the values of samples in the decoded pictures isolated from timing information, allowing timing information to be carried and controlled/changed separately by a system without affecting decoded picture content.
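As an illustration of the Exp-Golomb technique named in the entropy coding item above, the following is a minimal sketch of the unsigned variant, in which a non-negative value ''k'' is written as the binary form of ''k'' + 1 preceded by one fewer zero bits than that binary form has digits. The function names and the use of text bit strings are assumptions made for readability; they are not taken from the standard or from any particular codec implementation.

```python
def exp_golomb_encode(code_num: int) -> str:
    """Return the unsigned Exp-Golomb codeword for code_num as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    info = bin(code_num + 1)[2:]          # binary digits of code_num + 1
    return "0" * (len(info) - 1) + info   # zero prefix, one shorter than info


def exp_golomb_decode(bits: str, pos: int = 0) -> tuple[int, int]:
    """Decode one codeword starting at bits[pos]; return (code_num, next_pos)."""
    zeros = 0
    while pos + zeros < len(bits) and bits[pos + zeros] == "0":
        zeros += 1                        # count the leading zero bits
    end = pos + 2 * zeros + 1             # prefix + marker bit + suffix
    if end > len(bits):
        raise ValueError("truncated Exp-Golomb codeword")
    return int(bits[pos + zeros:end], 2) - 1, end


if __name__ == "__main__":
    values = [0, 1, 2, 3, 8]
    stream = "".join(exp_golomb_encode(v) for v in values)
    pos, decoded = 0, []
    while pos < len(stream):
        value, pos = exp_golomb_decode(stream, pos)
        decoded.append(value)
    print(stream, decoded)
```

Because every codeword announces its own length through the zero prefix, codewords can be concatenated and read back without separators, which is why small values (the most common ones) cost the fewest bits.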
These techniques, along with several others, help H.264 to perform significantly better than any prior standard under a wide variety of circumstances in a wide variety of application environments. H.264 can often perform radically better than MPEG-2 video, typically obtaining the same quality at half of the bit rate or less, especially on high bit rate and high resolution video content.<ref>{{cite web|author=Apple Inc. |url=https://www.apple.com/quicktime/technologies/h264/faq.html |title=H.264 FAQ |publisher=Apple |date=1999-03-26 |access-date=2010-05-17 |url-status=dead |archive-url=https://web.archive.org/web/20100307022217/http://www.apple.com/quicktime/technologies/h264/faq.html |archive-date=March 7, 2010 }}</ref>

Like other ISO/IEC MPEG video standards, H.264/AVC has a reference software implementation that can be freely downloaded.<ref>{{cite web|author=Karsten Suehring |url=http://iphome.hhi.de/suehring/tml/download/ |title=H.264/AVC JM Reference Software Download |publisher=Iphome.hhi.de |access-date=2010-05-17}}</ref> Its main purpose is to give examples of H.264/AVC features, rather than being a useful application ''per se''. Some reference hardware design work has also been conducted in the [[Moving Picture Experts Group]].

The above-mentioned aspects include features in all profiles of H.264. A profile for a codec is a set of features of that codec identified to meet a certain set of specifications of intended applications. This means that many of the features listed are not supported in some profiles. Various profiles of H.264/AVC are discussed in the next section.

=== Profiles ===
The standard defines several sets of capabilities, which are referred to as ''profiles'', targeting specific classes of applications. These are declared using a profile code (profile_idc) and sometimes a set of additional constraints applied in the encoder.
The profile code and indicated constraints allow a decoder to recognize the requirements for decoding that specific bitstream. (And in many system environments, only one or two profiles are allowed to be used, so decoders in those environments do not need to be concerned with recognizing the less commonly used profiles.) By far the most commonly used profile is the High Profile.

Profiles for non-scalable 2D video applications include the following:
;Constrained Baseline Profile (CBP, 66 with constraint set 1): Primarily for low-cost applications, this profile is most typically used in videoconferencing and mobile applications. It corresponds to the subset of features that are in common between the Baseline, Main, and High Profiles.
;Baseline Profile (BP, 66): Primarily for low-cost applications that require additional data loss robustness, this profile is used in some videoconferencing and mobile applications. This profile includes all features that are supported in the Constrained Baseline Profile, plus three additional features that can be used for loss robustness (or for other purposes such as low-delay multi-point video stream compositing). The importance of this profile has faded somewhat since the definition of the Constrained Baseline Profile in 2009. All Constrained Baseline Profile bitstreams are also considered to be Baseline Profile bitstreams, as these two profiles share the same profile identifier code value.
;Extended Profile (XP, 88): Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.
;Main Profile (MP, 77): This profile is used for standard-definition digital TV broadcasts that use the MPEG-4 format as defined in the DVB standard.<ref>{{cite web|url=http://www.etsi.org/deliver/etsi_ts/101100_101199/101154/01.09.01_60/ts_101154v010901p.pdf |title=TS 101 154 – V1.9.1 – Digital Video Broadcasting (DVB); Specification for the use of Video and Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream |access-date=2010-05-17}}</ref> It is not, however, used for high-definition television broadcasts, as the importance of this profile faded when the High Profile was developed in 2004 for that application.
;High Profile (HiP, 100): The primary profile for broadcast and disc storage applications, particularly for high-definition television applications (for example, this is the profile adopted by the [[Blu-ray Disc]] storage format and the [[Digital Video Broadcasting|DVB]] HDTV broadcast service).
;Progressive High Profile (PHiP, 100 with constraint set 4): Similar to the High profile, but without support of field coding features.
;Constrained High Profile (100 with constraint set 4 and 5): Similar to the Progressive High profile, but without support of B (bi-predictive) slices.
;High 10 Profile (Hi10P, 110): Going beyond typical mainstream consumer product capabilities, this profile builds on top of the High Profile, adding support for up to 10 bits per sample of decoded picture precision.
;High 4{{!:}}2{{!:}}2 Profile (Hi422P, 122): Primarily targeting professional applications that use interlaced video, this profile builds on top of the High 10 Profile, adding support for the 4:2:2 [[chroma sampling]] format while using up to 10 bits per sample of decoded picture precision.
;High 4{{!:}}4{{!:}}4 Predictive Profile (Hi444PP, 244): This profile builds on top of the High 4:2:2 Profile, supporting up to 4:4:4 chroma sampling, up to 14 bits per sample, and additionally supporting efficient lossless region coding and the coding of each picture as three separate color planes.

For camcorders, editing, and professional applications, the standard contains four additional [[Intra-frame]]-only profiles, which are defined as simple subsets of other corresponding profiles. These are mostly for professional (e.g., camera and editing system) applications:
;High 10 Intra Profile (110 with constraint set 3): The High 10 Profile constrained to all-Intra use.
;High 4{{!:}}2{{!:}}2 Intra Profile (122 with constraint set 3): The High 4:2:2 Profile constrained to all-Intra use.
;High 4{{!:}}4{{!:}}4 Intra Profile (244 with constraint set 3): The High 4:4:4 Profile constrained to all-Intra use.
;CAVLC 4{{!:}}4{{!:}}4 Intra Profile (44): The High 4:4:4 Profile constrained to all-Intra use and to CAVLC entropy coding (i.e., not supporting CABAC).

As a result of the [[Scalable Video Coding]] (SVC) extension, the standard contains five additional ''scalable profiles'', which are defined as a combination of an H.264/AVC profile for the base layer (identified by the second word in the scalable profile name) and tools that achieve the scalable extension:
;Scalable Baseline Profile (83): Primarily targeting video conferencing, mobile, and surveillance applications, this profile builds on top of the Constrained Baseline profile to which the base layer (a subset of the bitstream) must conform. For the scalability tools, a subset of the available tools is enabled.
;Scalable Constrained Baseline Profile (83 with constraint set 5): A subset of the Scalable Baseline Profile intended primarily for real-time communication applications.
;Scalable High Profile (86): Primarily targeting broadcast and streaming applications, this profile builds on top of the H.264/AVC High Profile to which the base layer must conform.
;Scalable Constrained High Profile (86 with constraint set 5): A subset of the Scalable High Profile intended primarily for real-time communication applications.
;Scalable High Intra Profile (86 with constraint set 3): Primarily targeting production applications, this profile is the Scalable High Profile constrained to all-Intra use.

As a result of the [[Multiview Video Coding]] (MVC) extension, the standard contains two ''multiview profiles'':
;Stereo High Profile (128): This profile targets two-view [[stereoscopic]] 3D video and combines the tools of the High profile with the inter-view prediction capabilities of the MVC extension.
;Multiview High Profile (118): This profile supports two or more views using both inter-picture (temporal) and MVC inter-view prediction, but does not support field pictures and macroblock-adaptive frame-field coding.

The Multi-resolution Frame-Compatible (MFC) extension added two more profiles:
;MFC High Profile (134): A profile for stereoscopic coding with two-layer resolution enhancement.
;MFC Depth High Profile (135):

The 3D-AVC extension added two more profiles:
;Multiview Depth High Profile (138): This profile supports joint coding of depth map and video texture information for improved compression of 3D video content.
;Enhanced Multiview Depth High Profile (139): An enhanced profile for combined multiview coding with depth information.

==== Feature support in particular profiles ====
{| class="wikitable"
|-
! Feature
! title="Constrained Baseline Profile" | CBP
! title="Baseline Profile" | BP
! title="Extended Profile" | XP
! title="Main Profile" | MP
! title="Progressive High Profile" | ProHiP
! title="High Profile" | HiP
! title="High 10 Profile" | Hi10P
! title="High 4:2:2 Profile" | Hi422P
! title="High 4:4:4 Predictive Profile" | Hi444PP
|- style="display:none"
! I and P slices
| {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|-
! [[Color depth|Bit depth]] (per sample)
| {{Yes|8}} || {{Yes|8}} || {{Yes|8}} || {{Yes|8}} || {{Yes|8}} || {{Yes|8}} || {{yes|8 to 10}} || {{yes|8 to 10}} || {{yes|8 to 14}}
|-
! [[Chroma subsampling|Chroma]] formats
| {{Yes|4:2:0<br /><br /> }} || {{Yes|4:2:0<br /><br /> }} || {{Yes|4:2:0<br /><br /> }} || {{Yes|4:2:0<br /><br /> }} || {{Yes|4:2:0<br /><br /> }} || {{Yes|4:2:0<br /><br /> }} || {{Yes|4:2:0<br /><br /> }} || {{yes|4:2:0/<br />4:2:2<br /> }} || {{yes|4:2:0/<br />4:2:2/<br />4:4:4}}
|-
! [[Flexible macroblock ordering|Flexible macroblock ordering (FMO)]]
| {{no}} || {{yes}} || {{yes}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}}
|-
! [[Arbitrary slice ordering|Arbitrary slice ordering (ASO)]]
| {{no}} || {{yes}} || {{yes}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}}
|-
! Redundant slices (RS)
| {{no}} || {{yes}} || {{yes}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}}
|-
! Data Partitioning
| {{no}} || {{no}} || {{yes}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}}
|-
! SI and SP slices
| {{no}} || {{no}} || {{yes}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}}
|-
! Interlaced coding (PicAFF, MBAFF)
| {{no}} || {{no}} || {{yes}} || {{yes}} || {{no}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|-
! B slices
| {{no}} || {{no}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|- style="display:none"
! Multiple reference frames
| {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|- style="display:none"
! [[Deblocking filter (video)|In-loop deblocking filter]]
| {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|- style="display:none"
!
[[CAVLC|CAVLC entropy coding]]
| {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|-
! [[CABAC|CABAC entropy coding]]
| {{no}} || {{no}} || {{no}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|-
! 4:0:0 ([[Monochrome]])
| {{no}} || {{no}} || {{no}} || {{no}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|-
! 8×8 vs. 4×4 transform adaptivity
| {{no}} || {{no}} || {{no}} || {{no}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|-
! Quantization scaling matrices
| {{no}} || {{no}} || {{no}} || {{no}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|-
! Separate C<sub>B</sub> and C<sub>R</sub> QP control
| {{no}} || {{no}} || {{no}} || {{no}} || {{yes}} || {{yes}} || {{yes}} || {{yes}} || {{yes}}
|-
! Separate color plane coding
| {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{yes}}
|-
! Predictive lossless coding
| {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{no}} || {{yes}}
|}

=== Levels ===
As the term is used in the standard, a "''level''" is a specified set of constraints that indicate a degree of required decoder performance for a profile. For example, a level of support within a profile specifies the maximum picture resolution, frame rate, and bit rate that a decoder may use. A decoder that conforms to a given level must be able to decode all bitstreams encoded for that level and all lower levels.
<!-- Please don't change the default state of the table to collapsed using the mw-collapsed option since that causes problems with some web browsers, such as making the customtoggle in the table not work when the page is refreshed or revisited. -->
{| class="wikitable" style="text-align:right;"
|+ Levels with maximum property values<ref name=AVC13April2017ITURecommendations/>
|-
! Level<br>
! Maximum<br>decoding speed<br>(macroblocks/s)
! Maximum<br>frame size<br />(macroblocks)
!
Maximum video<br>bit rate for video<br>coding layer (VCL)<br /> (Constrained Baseline,<br>Baseline, Extended<br>and Main Profiles)<br>(kbits/s)
! Examples for high resolution<br>@ highest frame rate<br>(maximum stored frames) <div class="mw-customtoggle-H264MPEG4AVC" style="color:#0B0080; cursor: pointer; border: 1px solid #aaa; border-radius: 10px; padding: 2px; margin-bottom: 5px;">Toggle additional details</div><br>
|-
! 1
| 1,485 || 99 || 64
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 128×96@30.9 (8)</div>176×144@15.0 (4)
|-
! 1b
| 1,485 || 99 || 128
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 128×96@30.9 (8)</div>176×144@15.0 (4)
|-
! 1.1
| 3,000 || 396 || 192
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 176×144@30.3 (9)<br>320×240@10.0 (3)</div>352×288@7.5 (2)
|-
! 1.2
| 6,000 || 396 || 384
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 320×240@20.0 (7)</div>352×288@15.2 (6)
|-
! 1.3
| 11,880 || 396 || 768
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 320×240@36.0 (7)</div>352×288@30.0 (6)
|-
! 2
| 11,880 || 396 || 2,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 320×240@36.0 (7)</div>352×288@30.0 (6)
|-
! 2.1
| 19,800 || 792 || 4,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 352×480@30.0 (7)</div>352×576@25.0 (6)
|-
! 2.2
| 20,250 || 1,620 || 4,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 352×480@30.7 (12)<br>352×576@25.6 (10)<br>720×480@15.0 (6)</div>720×576@12.5 (5)
|-
! 3
| 40,500 || 1,620 || 10,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 352×480@61.4 (12)<br>352×576@51.1 (10)<br>720×480@30.0 (6)</div>720×576@25.0 (5)
|-
! 3.1
| 108,000 || 3,600 || 14,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 720×480@80.0 (13)<br>720×576@66.7 (11)</div>1,280×720@30.0 (5)
|-
!
3.2
| 216,000 || 5,120 || 20,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 1,280×720@60.0 (5)</div>1,280×1,024@42.2 (4)
|-
! 4
| 245,760 || 8,192 || 20,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 1,280×720@68.3 (9)<br>1,920×1,080@30.1 (4)</div>2,048×1,024@30.0 (4)
|-
! 4.1
| 245,760 || 8,192 || 50,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 1,280×720@68.3 (9)<br>1,920×1,080@30.1 (4)</div>2,048×1,024@30.0 (4)
|-
! 4.2
| 522,240 || 8,704 || 50,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 1,280×720@145.1 (9)<br>1,920×1,080@64.0 (4)</div>2,048×1,080@60.0 (4)
|-
! 5
| 589,824 || 22,080 || 135,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 1,920×1,080@72.3 (13)<br>2,048×1,024@72.0 (13)<br>2,048×1,080@67.8 (12)<br>2,560×1,920@30.7 (5)</div>3,672×1,536@26.7 (5)
|-
! 5.1
| 983,040 || 36,864 || 240,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 1,920×1,080@120.5 (16)<br>2,560×1,920@51.2 (9)<br>3,840×2,160@31.7 (5)<br>4,096×2,048@30.0 (5)<br>4,096×2,160@28.5 (5)</div>4,096×2,304@26.7 (5)
|-
! 5.2
| 2,073,600 || 36,864 || 240,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 1,920×1,080@172.0 (16)<br>2,560×1,920@108.0 (9)<br>3,840×2,160@66.8 (5)<br>4,096×2,048@63.3 (5)<br>4,096×2,160@60.0 (5)</div>4,096×2,304@56.3 (5)
|-
! 6
| 4,177,920 || 139,264 || 240,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 3,840×2,160@128.9 (16)<br />7,680×4,320@32.2 (5)</div>8,192×4,320@30.2 (5)
|-
! 6.1
| 8,355,840 || 139,264 || 480,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 3,840×2,160@257.9 (16)<br />7,680×4,320@64.5 (5)</div>8,192×4,320@60.4 (5)
|-
!
6.2
| 16,711,680 || 139,264 || 800,000
| <div class="mw-collapsible" id="mw-customcollapsible-H264MPEG4AVC"> 3,840×2,160@300.0 (16)<br />7,680×4,320@128.9 (5)</div>8,192×4,320@120.9 (5)
|}

The maximum bit rate for the High Profile is 1.25 times that of the Constrained Baseline, Baseline, Extended and Main Profiles; 3 times for Hi10P, and 4 times for Hi422P/Hi444PP.

The number of luma samples is 16×16=256 times the number of macroblocks (and the number of luma samples per second is 256 times the number of macroblocks per second).

=== Decoded picture buffering ===
Previously encoded pictures are used by H.264/AVC encoders to provide predictions of the values of samples in other pictures. This allows the encoder to make efficient decisions on the best way to encode a given picture. At the decoder, such pictures are stored in a virtual ''decoded picture buffer'' (DPB). The maximum capacity of the DPB, in units of frames (or pairs of fields), as shown in parentheses in the right column of the table above, can be computed as follows:
: {{mono|''DpbCapacity'' {{=}} min(floor(''MaxDpbMbs'' / (''PicWidthInMbs'' * ''FrameHeightInMbs'')), 16)}}
Where {{mono|''MaxDpbMbs''}} is a constant value provided in the table below as a function of level number, and {{mono|''PicWidthInMbs''}} and {{mono|''FrameHeightInMbs''}} are the picture width and frame height for the coded video data, expressed in units of macroblocks (rounded up to integer values and accounting for cropping and macroblock pairing when applicable). This formula is specified in sections A.3.1.h and A.3.2.f of the 2017 edition of the standard.<ref name=AVC13April2017ITURecommendations/>
<div style="overflow-x:auto">
{| class="wikitable" style="text-align:center;width:800px;"
|-
! Level
| '''1''' || '''1b''' || '''1.1''' || '''1.2''' || '''1.3''' || '''2''' || '''2.1''' || '''2.2''' || '''3''' || '''3.1''' || '''3.2''' || '''4''' || '''4.1''' || '''4.2''' || '''5''' || '''5.1''' || '''5.2''' || '''6''' || '''6.1''' || '''6.2'''
|-
!
{{mono|MaxDpbMbs}}
| 396 || 396 || 900 || 2,376 || 2,376 || 2,376 || 4,752 || 8,100 || 8,100 || 18,000 || 20,480 || 32,768 || 32,768 || 34,816 || 110,400 || 184,320 || 184,320 || 696,320 || 696,320 || 696,320
|}</div>

For example, for an HDTV picture that is 1,920 samples wide ({{samp|1=PicWidthInMbs = 120}}) and 1,080 samples high ({{samp|1=FrameHeightInMbs = 68}}), a Level 4 decoder has a maximum DPB storage capacity of {{samp|floor(32768/(120*68))}} = 4 frames (or 8 fields). Thus, the value 4 is shown in parentheses in the table above in the right column of the row for Level 4 with the frame size 1920×1080.

The current picture being decoded is ''not included'' in the computation of DPB fullness (unless the encoder has indicated for it to be stored for use as a reference for decoding other pictures or for delayed output timing). Thus, a decoder needs to actually have sufficient memory to handle (at least) one frame ''more'' than the maximum capacity of the DPB as calculated above.
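
The DPB capacity formula and the worked Level 4 example above can be checked with a short sketch. The ''MaxDpbMbs'' values are copied from the table in this section; the function names are illustrative only, and the sketch assumes progressive frames whose dimensions are simply rounded up to whole 16×16 macroblocks, so it does not model the cropping and macroblock-pairing adjustments mentioned in the text.

```python
# MaxDpbMbs per level, copied from the table in this section.
MAX_DPB_MBS = {
    "1": 396, "1b": 396, "1.1": 900, "1.2": 2376, "1.3": 2376,
    "2": 2376, "2.1": 4752, "2.2": 8100, "3": 8100, "3.1": 18000,
    "3.2": 20480, "4": 32768, "4.1": 32768, "4.2": 34816, "5": 110400,
    "5.1": 184320, "5.2": 184320, "6": 696320, "6.1": 696320, "6.2": 696320,
}


def dpb_capacity(level: str, width: int, height: int) -> int:
    """DPB capacity in frames for a picture of width x height luma samples."""
    # Assumption: progressive frames, dimensions rounded up to whole macroblocks.
    pic_width_in_mbs = (width + 15) // 16
    frame_height_in_mbs = (height + 15) // 16
    frame_size_in_mbs = pic_width_in_mbs * frame_height_in_mbs
    return min(MAX_DPB_MBS[level] // frame_size_in_mbs, 16)


def frames_of_memory_needed(level: str, width: int, height: int) -> int:
    """DPB capacity plus one frame for the picture currently being decoded."""
    return dpb_capacity(level, width, height) + 1


if __name__ == "__main__":
    # 1920x1080 is 120 x 68 macroblocks; floor(32768 / 8160) = 4 at Level 4.
    print(dpb_capacity("4", 1920, 1080))
    print(frames_of_memory_needed("4", 1920, 1080))
```

The cap of 16 frames in the formula is what limits, for example, 1,920×1,080 video at Level 5.1 to the 16 stored frames shown in the levels table, even though the raw quotient is larger.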