
==Examples==
The example below uses [[ASCII]] text for simplicity, but this is not a typical use case, since ASCII text can already be transferred safely across all systems that can handle Base64. A more typical use is to encode [[binary data]] (such as an image); the resulting Base64 data contains only 64 different ASCII characters, all of which can be transferred reliably across systems that may corrupt the raw source bytes.

Here is a well-known [[idiom]] from [[distributed computing]]:

{{Quote box
 | align = none
 | style = margin:1em 0;
 | border = 2px
 | fontsize = 800
 | quote = Many hands make light work.
}}

When the quote (without trailing whitespace) is encoded into Base64, it is represented as a byte sequence of 8-bit-padded [[ASCII]] characters encoded in [[MIME]]'s Base64 scheme as follows (newlines and whitespace may be present anywhere but are to be ignored on decoding):

{{Quote box
 | align = none
 | style = margin:1em 0;
 | border = 2px
 | fontsize = 800
 | quote={{mono|1=TWFueSBoYW5kcyBtYWtlIGxpZ2h0IHdvcmsu}}
}}
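
The encoding of the quote can be reproduced with a short sketch that assumes Python and its standard <code>base64</code> module. The variable names and the sample bytes in <code>raw</code> are illustrative assumptions, not part of the example above.

```python
import base64

quote = "Many hands make light work."

# Encode the ASCII bytes of the quote; b64encode returns bytes, so decode to str.
encoded = base64.b64encode(quote.encode("ascii")).decode("ascii")
print(encoded)

# By default, b64decode discards characters outside the Base64 alphabet,
# so a line break inserted into the encoded text does not change the result.
wrapped = encoded[:16] + "\n" + encoded[16:]
decoded = base64.b64decode(wrapped).decode("ascii")
print(decoded)

# The more typical use: arbitrary binary data (hypothetical sample bytes).
raw = bytes([0x00, 0xFF, 0x10])
binary_encoded = base64.b64encode(raw).decode("ascii")
print(binary_encoded)
```

The output of the last step contains only characters from the 64-character alphabet, even though the input bytes are not printable ASCII.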

In the above quote, the encoded value of ''Man'' is ''TWFu''. Encoded in ASCII, the characters ''M'', ''a'', and ''n'' are stored as the byte values <code>77</code>, <code>97</code>, and <code>110</code>, which are the 8-bit binary values <code>01001101</code>, <code>01100001</code>, and <code>01101110</code>. These three values are joined together into a 24-bit string, producing <code>010011010110000101101110</code>. Groups of 6 bits (6 bits have a maximum of 2<sup>6</sup>&nbsp;=&nbsp;64 different binary values) are [[Binary number#Counting in binary|converted into individual numbers]] from start to end (in this case, there are four numbers in a 24-bit string), which are then converted into their corresponding Base64 character values.

As this example illustrates, Base64 encoding converts three [[octet (computing)|octets]] into four encoded characters.

{| class="wikitable" style="text-align:center;"
|+ Encoding of the source string ⟨Man⟩ in Base64
|- style="font-weight:bold;"
! rowspan=2 scope="row" | Source <br/>ASCII text
! scope="row" | Character
| colspan="8" | M
| colspan="8" | a
| colspan="8" | n
|-
! scope="row" | Octets
| colspan="8" | 77 (0x4d)
| colspan="8" | 97 (0x61)
| colspan="8" | 110 (0x6e)
|-
! colspan=2 scope="row" | Bits
| 0 || 1 || 0 || 0 || 1 || 1 || 0 || 1
| 0 || 1 || 1 || 0 || 0 || 0 || 0 || 1
| 0 || 1 || 1 || 0 || 1 || 1 || 1 || 0
|-
! rowspan=3 scope="row" | Base64<br/>encoded
! scope="row" | Sextets
| colspan="6" | 19
| colspan="6" | 22
| colspan="6" | 5
| colspan="6" | 46
|- style="font-weight:bold;"
! scope="row" | Character
| colspan="6" | T
| colspan="6" | W
| colspan="6" | F
| colspan="6" | u
|-
! scope="row" | Octets
| colspan="6" | 84 (0x54)
| colspan="6" | 87 (0x57)
| colspan="6" | 70 (0x46)
| colspan="6" | 117 (0x75)
|}
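
The steps in the table can be sketched in Python as follows. The function name <code>encode_triplet</code> is a hypothetical helper chosen for illustration, and the alphabet is assumed to be the standard one (<code>A</code>–<code>Z</code>, <code>a</code>–<code>z</code>, <code>0</code>–<code>9</code>, <code>+</code>, <code>/</code>).

```python
import string

# Standard Base64 alphabet: index 0 is "A", index 63 is "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_triplet(octets: bytes) -> str:
    """Encode exactly three octets as four Base64 characters."""
    if len(octets) != 3:
        raise ValueError("expected exactly three octets")
    # Join the three 8-bit values into one 24-bit integer.
    bits = (octets[0] << 16) | (octets[1] << 8) | octets[2]
    # Read four 6-bit groups, from most to least significant.
    sextets = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[s] for s in sextets)


print(encode_triplet(b"Man"))
```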

<code>=</code> padding characters might be added to make the last encoded block contain four Base64 characters.

Converting [[hexadecimal]] to [[octal]] is a convenient way to convert between binary and Base64. Such conversion is available in both advanced calculators and programming languages. For example, the hexadecimal representation of the 24 bits above is 4D616E. The octal representation is 23260556. Those 8 octal digits can be split into pairs ({{nowrap|23 26 05 56}}), and each pair is converted to decimal to yield {{nowrap|19 22 05 46}}. Using those four decimal numbers as indices into the Base64 alphabet, the corresponding ASCII characters are ''TWFu''.
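
The hexadecimal-to-octal route can be sketched in Python as follows; the variable names are illustrative, and the approach works here because each pair of octal digits covers exactly six bits.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

value = int("4D616E", 16)

# Eight octal digits cover the 24 bits; keep leading zeros.
octal_digits = format(value, "08o")

# Each pair of octal digits is one 6-bit group.
pairs = [octal_digits[i:i + 2] for i in range(0, 8, 2)]
indices = [int(pair, 8) for pair in pairs]

encoded = "".join(ALPHABET[i] for i in indices)
print(octal_digits, pairs, indices, encoded)
```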

If there are only two significant input octets (e.g., 'Ma'), or when the last input group contains only two octets, all 16 bits are captured in the first three Base64 digits (18 bits); the two [[least significant bit]]s of the last content-bearing 6-bit block are zero and are discarded on decoding (along with the succeeding <code>=</code> padding character):

{|class="wikitable" style="text-align:center;"
|- style="font-weight:bold;"
! rowspan=2 scope="row"   | Source <br/>ASCII text
! scope="row"             | Character
| colspan="8"             | M
| colspan="8"             | a
| colspan="8" rowspan="2"   {{n/a|}}
|-
! scope="row" | Octets
| colspan="8" | 77 (0x4d)
| colspan="8" | 97 (0x61)
|-
! colspan=2 scope="row" | Bits
| 0 || 1 || 0 || 0 || 1 || 1
| 0 || 1 || 0 || 1 || 1 || 0
| 0 || 0 || 0 || 1
| style="background-color:lightblue;" | 0
| style="background-color:lightblue;" | 0
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
|-
! rowspan=3 scope="row" | Base64<br/>encoded
! scope="row" | Sextets
| colspan="6" | 19
| colspan="6" | 22
| colspan="6" | 4
| colspan="6"   {{n/a|Padding}}
|- style="font-weight:bold;"
! scope="row" | Character
| colspan="6" | T
| colspan="6" | W
| colspan="6" | E
| colspan="6" | =
|-
! scope="row" | Octets
| colspan="6" | 84 (0x54)
| colspan="6" | 87 (0x57)
| colspan="6" | 69 (0x45)
| colspan="6" | 61 (0x3D)
|}

If there is only one significant input octet (e.g., 'M'), or when the last input group contains only one octet, all 8 bits are captured in the first two Base64 digits (12 bits); the four [[least significant bit]]s of the last content-bearing 6-bit block are zero and are discarded on decoding (along with the succeeding two <code>=</code> padding characters):

{| class="wikitable" style="text-align:center;"
|- style="font-weight:bold;"
! rowspan=2 scope="row"    | Source <br/>ASCII text
! scope="row"              | Character
| colspan="8"              | M
| colspan="16" rowspan="2"   {{n/a|}}
|-
! scope="row" | Octets
| colspan="8" | 77 (0x4d)
|-
! colspan=2 scope="row" | Bits
| 0 || 1 || 0 || 0 || 1 || 1
| 0 || 1
| style="background-color:lightblue;" | 0
| style="background-color:lightblue;" | 0
| style="background-color:lightblue;" | 0
| style="background-color:lightblue;" | 0

| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}

| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
| {{n/a|{{fsp}}}}
|-
! rowspan=3 scope="row" | Base64 <br/>encoded
! scope="row" | Sextets
| colspan="6" | 19
| colspan="6" | 16
| colspan="6"   {{n/a|Padding}}
| colspan="6"   {{n/a|Padding}}
|- style="font-weight:bold;"
! scope="row" | Character
| colspan="6" | T
| colspan="6" | Q
| colspan="6" | =
| colspan="6" | =
|-
! scope="row" | Octets
| colspan="6" | 84 (0x54)
| colspan="6" | 81 (0x51)
| colspan="6" | 61 (0x3D)
| colspan="6" | 61 (0x3D)
|}
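
Both short-group cases can be sketched in Python with one function. The name <code>encode_final_group</code> is a hypothetical helper; it assumes the standard alphabet and fills the missing octets with zero bits before splitting into sextets.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_final_group(octets: bytes) -> str:
    """Encode a final group of one or two octets, adding '=' padding."""
    if not 1 <= len(octets) <= 2:
        raise ValueError("expected one or two octets")
    # Fill the missing octets with zero bits to reach 24 bits.
    bits = int.from_bytes(octets + b"\x00" * (3 - len(octets)), "big")
    sextets = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    # One octet needs two characters, two octets need three.
    keep = len(octets) + 1
    return "".join(ALPHABET[s] for s in sextets[:keep]) + "=" * (4 - keep)


print(encode_final_group(b"Ma"))
print(encode_final_group(b"M"))
```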

===Output padding===
Because Base64 is a six-bit encoding, and because the decoded values are divided into 8-bit octets, every four characters of Base64-encoded text (4 sextets = {{times|4|6}} = 24 bits) represent three octets of unencoded text or data (3 octets = {{times|3|8}} = 24 bits). This means that when the length of the unencoded input is not a multiple of three, the encoded output must have padding added so that its length is a multiple of four. The padding character is <code>=</code>, which indicates that no further bits are needed to fully encode the input. (This is different from <code>A</code>, which means that the remaining bits are all zeros.) The example below illustrates how truncating the input of the above quote changes the output padding:

<!-- This is the encoding of **THE WHOLE** of the above passage and the ending fits in with both the above encoding and the first line of the following example, verified using
http://www.motobit.com/util/base64-decoder-encoder.asp
In the previous version, the example started with a space, which was not visible and thus quite misleading. -->
{|class="wikitable"
! scope="col" colspan=2 | Input
! scope="col" colspan=2 | Output
! scope="col" rowspan=2 | Padding
|-
! scope="col" | Text
! scope="col" | Length
! scope="col" | Text
! scope="col" | Length
|-
| ''light {{bg|lightgrey|wor}}{{bg|#cef2e0|k.}}'' || 11
| {{mono|1=bGlnaHQg{{bg|lightgrey|d29y}}{{bg|#cef2e0|2=ay4=}}}} || 16
| 1
|-
| ''light {{bg|lightgrey|wor}}{{bg|#cef2e0|k}}'' || 10
| {{mono|1=bGlnaHQg{{bg|lightgrey|d29y}}{{bg|#cef2e0|2=aw==}}}} || 16
| 2
|-
| ''light {{bg|lightgrey|wor}}'' || 9
| {{mono|1=bGlnaHQg{{bg|lightgrey|d29y}}}} || 12
| 0
|- 
| ''light {{bg|lightgrey|wo}}'' || 8
| {{mono|1=bGlnaHQg{{bg|lightgrey|2=d28=}}}} || 12
| 1
|-
| ''light {{bg|lightgrey|w}}'' || 7
| {{mono|1=bGlnaHQg{{bg|lightgrey|2=dw==}}}} || 12
| 2
|}
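
The rows of the table can be regenerated with a short Python sketch that assumes the standard <code>base64</code> module. The expression <code>(3 - length % 3) % 3</code> is one way to state the rule that the padding count depends only on the input length.

```python
import base64

text = "light work."
rows = []
for length in range(len(text), 6, -1):
    chunk = text[:length].encode("ascii")
    encoded = base64.b64encode(chunk).decode("ascii")
    padding = encoded.count("=")
    # The padding count follows from the input length alone.
    assert padding == (3 - length % 3) % 3
    rows.append((length, encoded, len(encoded), padding))

for row in rows:
    print(row)
```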

The padding character is not essential for decoding, since the number of missing bytes can be inferred from the length of the encoded text. In some implementations, the padding character is mandatory, while in others it is not used. Padding characters are required, however, when multiple Base64-encoded files have been concatenated.

===Decoding Base64 with padding===
When decoding Base64 text, four characters are typically converted back to three bytes. The only exception is a final group that contains padding characters. A single <code>=</code> indicates that the four characters will decode to only two bytes, while <code>==</code> indicates that the four characters will decode to only a single byte. For example:

{| class="wikitable"
! Encoded !! Padding !! Decoded length of final group (bytes) !! Decoded
|-
| {{mono|1=bGlnaHQg{{bg|lightgrey|2=dw==}}}}
| <code>==</code> || 1
| ''light {{bg|lightgrey|w}}''
|-
| {{mono|1=bGlnaHQg{{bg|lightgrey|2=d28=}}}}
| <code>=</code> || 2
| ''light {{bg|lightgrey|wo}}''
|-
| {{mono|1=bGlnaHQg{{bg|lightgrey|d29y}}}}
| {{CNone|None}} || 3
| ''light {{bg|lightgrey|wor}}''
|}
Another way to interpret the padding character is to consider it as an instruction to discard 2 trailing bits from the bit string each time a <code>=</code> is encountered. For example, when {{mono|1=bGlnaHQg{{bg|lightgrey|2=dw==}}}} is decoded, we convert each character (except the trailing occurrences of <code>=</code>) into its corresponding 6-bit representation, and then discard 2 trailing bits for the first <code>=</code> and another 2 trailing bits for the other <code>=</code>. In this instance, we would get 6 bits from the <code>d</code>, and another 6 bits from the <code>w</code> for a bit string of length 12, but since we remove 2 bits for each <code>=</code> (for a total of 4 bits), the <code>dw==</code> ends up producing 8 bits (1 byte) when decoded.
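
The "discard 2 bits per <code>=</code>" interpretation can be sketched in Python as follows. The function <code>decode_padded</code> is a hypothetical helper that assumes well-formed input in the standard alphabet and performs no validation.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def decode_padded(text: str) -> bytes:
    """Decode padded Base64 by dropping 2 trailing bits per '='."""
    body = text.rstrip("=")
    pad = len(text) - len(body)
    # Convert each character into its 6-bit representation.
    bits = "".join(format(ALPHABET.index(c), "06b") for c in body)
    # Discard 2 trailing bits for every padding character.
    if pad:
        bits = bits[:-2 * pad]
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


print(decode_padded("dw=="))
print(decode_padded("bGlnaHQgd28="))
```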

===Decoding Base64 without padding===
Without padding, after the normal decoding of four characters to three bytes has been repeated as often as possible, fewer than four encoded characters may remain. In this situation, only two or three characters can remain. A single remaining encoded character is not possible, because a single Base64 character only contains 6 bits, and 8 bits are required to create a byte, so a minimum of two Base64 characters is required: the first character contributes 6 bits, and the second character contributes its first 2 bits. For example:

{| class="wikitable"
! Length of final group (characters) !! Encoded !! Decoded length of final group (bytes) !! Decoded
|-
| 2 || {{mono|1=bGlnaHQg{{bg|lightgrey|dw}}}}
| 1 || ''light {{bg|lightgrey|w}}''
|-
| 3 || {{mono|1=bGlnaHQg{{bg|lightgrey|d28}}}}
| 2 || ''light {{bg|lightgrey|wo}}''
|-
| 4 || {{mono|1=bGlnaHQg{{bg|lightgrey|d29y}}}}
| 3 || ''light {{bg|lightgrey|wor}}''
|}
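
One possible way to decode unpadded text in Python is to restore the padding implied by the length before calling the standard <code>base64</code> decoder. The function <code>decode_unpadded</code> below is a hypothetical helper, not a standard API, and it rejects the impossible case of a single leftover character.

```python
import base64


def decode_unpadded(text: str) -> bytes:
    """Decode Base64 text whose trailing '=' padding has been omitted."""
    if len(text) % 4 == 1:
        # One leftover character carries only 6 bits, less than one byte.
        raise ValueError("a single leftover character cannot encode a byte")
    # Restore the padding implied by the length, then decode normally.
    return base64.b64decode(text + "=" * (-len(text) % 4))


print(decode_unpadded("bGlnaHQgdw"))
print(decode_unpadded("bGlnaHQgd28"))
print(decode_unpadded("bGlnaHQgd29y"))
```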

Decoding without padding is not performed consistently among decoders. In addition, allowing padless decoding by definition allows multiple strings to decode into the same set of bytes, which can be a security risk.<ref>{{cite conference |last1=Chalkias |first1=Konstantinos |last2=Chatzigiannis |first2=Panagiotis |title=Base64 Malleability in Practice |conference=ASIA CCS '22: 2022 ACM on Asia Conference on Computer and Communications Security |date=30 May 2022 |pages=1219–1221 |doi=10.1145/3488932.3527284 |url=https://eprint.iacr.org/2022/361.pdf}}</ref>
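
The following Python sketch illustrates how a decoder that accepts both padded and unpadded text maps two different strings to the same bytes. Both function names are hypothetical, and the re-encode-and-compare check in <code>is_canonical</code> is only one possible mitigation, shown here as an assumption rather than a recommended standard.

```python
import base64


def decode_lenient(text: str) -> bytes:
    """Accept Base64 text with or without its trailing padding."""
    return base64.b64decode(text + "=" * (-len(text) % 4))


def is_canonical(text: str) -> bool:
    """Return True only if re-encoding the decoded bytes reproduces text."""
    try:
        decoded = base64.b64decode(text, validate=True)
    except ValueError:
        # binascii.Error is a subclass of ValueError.
        return False
    return base64.b64encode(decoded).decode("ascii") == text


padded = "bGlnaHQgdw=="
unpadded = "bGlnaHQgdw"

# Two distinct strings, one decoded value.
print(decode_lenient(padded) == decode_lenient(unpadded))
print(is_canonical(padded), is_canonical(unpadded))
```

A system that compares or signs the encoded text rather than the decoded bytes would treat these two strings as different values, which is where the risk arises.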