
== The block code and its parameters ==

[[Error-correcting code]]s are used to [[reliability (computer networking)|reliably]] transmit [[digital data]] over unreliable [[communication channel]]s subject to [[channel noise]].
When a sender wants to transmit a possibly very long data stream using a block code, the sender breaks the stream up into pieces of some fixed size. Each such piece is called a ''message'', and the procedure given by the block code encodes each message individually into a codeword, also called a ''block'' in the context of block codes. The sender then transmits all blocks to the receiver, who can in turn use some decoding mechanism to (hopefully) recover the original messages from the possibly corrupted received blocks.
The performance and success of the overall transmission depend on the parameters of the channel and the block code.
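
The splitting and per-message encoding described above can be sketched in a few lines of Python. This is only an illustration: the single-parity-check code (<math>k=3</math>, <math>n=4</math>) and the function names <code>encode_parity</code> and <code>encode_stream</code> are assumptions chosen for the example, not part of any particular standard.

```python
def encode_parity(message: str) -> str:
    """Illustrative block code: append one even-parity bit to a binary message."""
    parity = sum(int(bit) for bit in message) % 2
    return message + str(parity)


def encode_stream(stream: str, k: int) -> list[str]:
    """Split a binary stream into messages of length k and encode each one separately."""
    if len(stream) % k != 0:
        raise ValueError("stream length must be a multiple of k")
    messages = [stream[i:i + k] for i in range(0, len(stream), k)]
    return [encode_parity(m) for m in messages]


# The stream "101100" is split into the messages "101" and "100".
blocks = encode_stream("101100", k=3)
print(blocks)  # ['1010', '1001']
```

Each message is encoded independently of the others, which is the defining feature of a block code.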

Formally, a block code is an [[injective]] mapping
:<math>C:\Sigma^k \to \Sigma^n</math>.
Here, <math>\Sigma</math> is a finite and nonempty [[set (mathematics)|set]] and <math>k</math> and <math>n</math> are integers. The meaning and significance of these three parameters and other parameters related to the code are described below.

=== The alphabet Σ ===
The data stream to be encoded is modeled as a [[string (computer science)|string]] over some '''alphabet''' <math>\Sigma</math>. The size <math>|\Sigma|</math> of the alphabet is often written as <math>q</math>. If <math>q=2</math>, then the block code is called a ''binary'' block code. In many applications it is useful to consider <math>q</math> to be a [[prime power]], and to identify <math>\Sigma</math> with the [[finite field]] <math>\mathbb F_q</math>.

=== The message length ''k'' ===
Messages are elements <math>m</math> of <math>\Sigma^k</math>, that is, strings of length <math>k</math>.
Hence the number <math>k</math> is called the '''message length''' or '''dimension''' of a block code.

=== The block length ''n'' ===
The '''block length''' <math>n</math> of a block code is the number of symbols in a block. The elements <math>c</math> of <math>\Sigma^n</math> are therefore strings of length <math>n</math>; because they correspond to the blocks that may be received by the receiver, they are also called ''received words''.
If <math>c=C(m)</math> for some message <math>m</math>, then <math>c</math> is called the codeword of <math>m</math>.

=== The rate ''R'' ===
The '''rate''' of a block code is defined as the ratio between its message length and its block length:
:<math>R=k/n</math>.
A large rate means that the amount of actual message per transmitted block is high. In this sense, the rate measures the transmission speed and the quantity <math>1-R</math> measures the overhead that occurs due to the encoding with the block code.
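The rate cannot exceed <math>1</math>, which reflects the [[information theory|information-theoretic]] fact that data cannot in general be losslessly compressed. Formally, this follows from the fact that the code <math>C</math> is an injective map: injectivity requires <math>|\Sigma|^k \le |\Sigma|^n</math>, and hence <math>k \le n</math> whenever the alphabet has at least two symbols.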
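
As a small worked example, the rate and overhead can be computed exactly with rational arithmetic. The helper names <code>rate</code> and <code>overhead</code> are hypothetical, and the values <math>k=4</math>, <math>n=7</math> (the parameters of the binary Hamming(7,4) code) are used only for illustration.

```python
from fractions import Fraction


def rate(k: int, n: int) -> Fraction:
    """Rate R = k/n of a block code with message length k and block length n."""
    if not 0 < k <= n:
        raise ValueError("expected 0 < k <= n")
    return Fraction(k, n)


def overhead(k: int, n: int) -> Fraction:
    """Fraction 1 - R of each block that is spent on redundancy."""
    return 1 - rate(k, n)


print(rate(4, 7), overhead(4, 7))  # 4/7 3/7
```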

=== {{anchor|Minimum distance}}The distance ''d'' ===
The '''distance''' or '''minimum distance''' {{mvar|d}} of a block code is the minimum number of positions in which any two distinct codewords differ, and the '''relative distance''' <math>\delta</math> is the fraction <math>d/n</math>.
Formally, for received words <math>c_1,c_2\in\Sigma^n</math>, let <math>\Delta(c_1,c_2)</math> denote the [[Hamming distance]] between <math>c_1</math> and <math>c_2</math>, that is, the number of positions in which <math>c_1</math> and <math>c_2</math> differ.
Then the minimum distance <math>d</math> of the code <math>C</math> is defined as
:<math>d := \min_{m_1,m_2\in\Sigma^k\atop m_1\neq m_2} \Delta[C(m_1),C(m_2)]</math>.
Since any code has to be [[injective]], any two distinct codewords disagree in at least one position, so the distance of any code is at least <math>1</math>. Moreover, for linear block codes the '''distance''' equals the '''[[Hamming weight#Minimum weight|minimum weight]]''', because the difference of two codewords is again a codeword:{{cn|date=December 2024}}
:<math>\min_{m_1,m_2\in\Sigma^k\atop m_1\neq m_2} \Delta[C(m_1),C(m_2)] = \min_{m_1,m_2\in\Sigma^k\atop m_1\neq m_2} \Delta[\mathbf{0},C(m_2)-C(m_1)] = \min_{m\in\Sigma^k\atop m\neq\mathbf{0}} w[C(m)] = w_\min</math>.
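
For a small code, both quantities can be checked by brute force. The sketch below assumes one particular systematic generator matrix <code>G</code> for the binary [7,4] Hamming code; the matrix and the helper names are choices made for this example.

```python
from itertools import combinations, product

# Assumed generator matrix of a binary [7,4] Hamming code, in the form [I_4 | P].
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]


def encode(message):
    """Compute the codeword m*G over the field with two elements."""
    return tuple(
        sum(m_i * row[j] for m_i, row in zip(message, G)) % 2
        for j in range(len(G[0]))
    )


def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


codewords = [encode(m) for m in product((0, 1), repeat=4)]

# Minimum distance: minimum over all pairs of distinct codewords.
d = min(hamming_distance(c1, c2) for c1, c2 in combinations(codewords, 2))

# Minimum weight: minimum Hamming weight over all nonzero codewords.
w_min = min(sum(c) for c in codewords if any(c))

print(d, w_min)  # 3 3
```

The pairwise search compares all <math>\tbinom{16}{2}=120</math> pairs, whereas the weight search inspects only the 15 nonzero codewords; this is why the identity is convenient for linear codes.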

A larger distance allows for more error correction and detection.
For example, if we only consider errors that may change symbols of the sent codeword but never erase or add them, then the number of errors is the number of positions in which the sent codeword and the received word differ.
A code with distance {{mvar|d}} allows the receiver to detect up to <math>d-1</math> transmission errors, since changing at most <math>d-1</math> positions of a codeword can never accidentally yield another codeword. Furthermore, if no more than <math>\lfloor (d-1)/2 \rfloor</math> transmission errors occur, the receiver can uniquely decode the received word to a codeword. This is because every received word has at most one codeword within distance <math>\lfloor (d-1)/2 \rfloor</math>. If more than <math>\lfloor (d-1)/2 \rfloor</math> transmission errors occur, the receiver cannot uniquely decode the received word in general, as there might be several possible codewords. One way for the receiver to cope with this situation is to use [[list decoding]], in which the decoder outputs a list of all codewords within a certain radius.
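
The detection and unique-decoding guarantees can be illustrated with a toy code. The four-codeword binary code <code>CODE</code> below and the helper names are assumptions made for this sketch, and the exhaustive search is only practical for very small codes.

```python
from itertools import combinations

# Assumed toy code: four binary codewords of length 5 with minimum distance 3.
CODE = ["00000", "01011", "10101", "11110"]


def hamming_distance(a: str, b: str) -> int:
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


d = min(hamming_distance(a, b) for a, b in combinations(CODE, 2))
t = (d - 1) // 2  # unique-decoding radius, floor((d-1)/2)


def detects_error(received: str) -> bool:
    """An error is detected whenever the received word is not a codeword."""
    return received not in CODE


def decode_unique(received: str):
    """Return the codeword within distance t of the received word, or None."""
    candidates = [c for c in CODE if hamming_distance(c, received) <= t]
    return candidates[0] if len(candidates) == 1 else None


# One error in "01011" (third bit flipped) is detected and corrected.
print(detects_error("01111"), decode_unique("01111"))  # True 01011

# Two errors in "01011" (first and third bits flipped) are still detected,
# but lie beyond the radius t = 1, so decoding returns a different codeword.
print(detects_error("11111"), decode_unique("11111"))  # True 11110
```

The second case shows why the bound <math>\lfloor (d-1)/2 \rfloor</math> matters: with more errors, the received word can fall closer to a codeword other than the one that was sent.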

=== Popular notation ===
The notation <math>(n,k,d)_q</math> describes a block code over an alphabet <math>\Sigma</math> of size <math>q</math>, with a block length <math>n</math>, message length <math>k</math>, and distance <math>d</math>.
If the block code is a linear block code, square brackets are used instead, as in <math>[n,k,d]_q</math>, to indicate that fact.
For binary codes with <math>q=2</math>, the index is sometimes dropped.
For [[maximum distance separable code]]s, the distance is always <math>d=n-k+1</math>, but sometimes the precise distance is not known, non-trivial to prove or state, or not needed. In such cases, the <math>d</math>-component may be missing.

Sometimes, especially for non-block codes, the notation <math>(n,M,d)_q</math> is used for codes that contain <math>M</math> codewords of length <math>n</math>. For block codes with messages of length <math>k</math> over an alphabet of size <math>q</math>, this number would be <math>M=q^k</math>.
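
The relations between these parameters can be captured in a small data structure. The class name <code>BlockCodeParams</code> and its properties are hypothetical; the two parameter sets shown are the binary Hamming code <math>[7,4,3]_2</math> and the Reed–Solomon code <math>[255,223,33]_{256}</math>, used here as illustrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockCodeParams:
    """Parameters (n, k, d)_q of a block code; q defaults to 2 (binary)."""
    n: int
    k: int
    d: int
    q: int = 2

    @property
    def num_codewords(self) -> int:
        """Number of codewords, M = q^k."""
        return self.q ** self.k

    @property
    def is_mds(self) -> bool:
        """True if the distance equals n - k + 1 (maximum distance separable)."""
        return self.d == self.n - self.k + 1


hamming = BlockCodeParams(n=7, k=4, d=3)
reed_solomon = BlockCodeParams(n=255, k=223, d=33, q=256)

print(hamming.num_codewords, hamming.is_mds)  # 16 False
print(reed_solomon.is_mds)                    # True
```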