Editing Zero-based numbering (section)

== Computer programming ==

=== Origin ===
[[Martin Richards (computer scientist)|Martin Richards]], creator of the [[BCPL]] language (a precursor of [[C (programming language)|C]]), designed arrays initiating at 0 as the natural position to start accessing the array contents in the language, since the value of a [[pointer (computer programming)|pointer]] ''p'' used as an address accesses the position {{nowrap|''p'' + 0}} in memory.<ref>{{cite book |url=http://cm.bell-labs.com/cm/cs/who/dmr/bcpl.pdf |title=The BCPL Reference Manual |publisher=Massachusetts Institute of Technology |author=Martin Richards |year=1967 |page=11}}</ref><ref name="mikehoye">{{cite web |url=http://exple.tive.org/blarg/2013/10/22/ |title=Cita{{not a typo|tion Nee}}ded |access-date=28 January 2014 |author=Mike Hoye}}</ref> BCPL was first compiled for the [[IBM&nbsp;7094]]; the language introduced no [[Run time (program lifecycle phase)|run-time]] [[indirection lookup]]s, so the indirection optimization provided by these arrays was done at compile time.<ref name="mikehoye"/> The optimization was nevertheless important.<ref name="mikehoye"/><ref>{{cite web |url=https://www.multicians.org/thvv/7094.html |title=The IBM 7094 and CTSS |date=1995 |access-date=28 January 2014 |author=Tom Van Vleck}}</ref>

In 1982 [[Edsger W. Dijkstra]] in his pertinent note ''Why numbering should start at zero''<ref name="dijkstra"/> argued that arrays subscripts should start at zero as the latter being the most [[natural number]]. Discussing possible designs of array ranges by enclosing them in a chained inequality, combining sharp and standard inequalities to four possibilities, demonstrating that to his conviction zero-based arrays are best represented by non-overlapping index ranges, which start at zero, alluding to [[Interval (mathematics)#Definitions|open, half-open and closed intervals]] as with the real numbers. Dijkstra's criteria for preferring this convention are in detail that it represents empty sequences in a more natural way {{nowrap|(''a'' ≤ ''i'' < ''a'' ?)}} than closed "intervals" ({{nowrap|''a'' ≤ ''i'' ≤ (''a'' − 1) ?}}), and that with half-open "intervals" of naturals, the length of a sub-sequence equals the upper minus the lower bound ({{nowrap|''a'' ≤ ''i'' < ''b''}} gives {{nowrap|(''b'' − ''a'')}} possible values for ''i'', with ''a'', ''b'', ''i'' all integers).

=== {{Anchor|OFFSET}}Usage in programming languages ===

{{See also|Comparison of programming languages (array)#Array dimensions}}

This usage follows from design choices embedded in many influential [[programming language]]s, including [[C (programming language)|C]], [[Java (programming language)|Java]], and [[Lisp programming language|Lisp]]. In these three, sequence types (C arrays, Java arrays and lists, and Lisp lists and vectors) are indexed beginning with the zero subscript. Particularly in C, where arrays are closely tied to [[pointer (computer programming)|pointer]] arithmetic, this makes for a simpler implementation: the subscript refers to an offset from the starting position of an array, so the first element has an offset of zero.

Referencing memory by an address and an offset is represented directly in [[computer hardware]] on virtually all computer architectures, so this design detail in C makes compilation easier, at the cost of some human factors. In this context using "zeroth" as an ordinal is not strictly correct, but a widespread habit in this profession. Other programming languages, such as [[Fortran]] or [[COBOL]], have array subscripts starting with one, because they were meant as [[high-level programming language]]s, and as such they had to have a correspondence to the usual [[ordinal number (linguistics)|ordinal numbers]] which predate the [[0|invention of the zero]] by a long time.

[[Pascal (programming language)|Pascal]] allows the range of an array to be of any ordinal type (including enumerated types). [[APL (programming language)|APL]] allows setting the index origin to 0 or 1 during runtime programmatically.<ref>{{cite journal |last1=Brown |first1=Jim |title=In Defense of Index Origin 0 |journal=ACM SIGAPL APL Quote Quad |date=December 1978 |volume=9 |issue=2 |page=7 |doi=10.1145/586050.586053 |s2cid=40187000}}</ref><ref>{{cite web |last1=Hui |first1=Roger |title=Is Index Origin 0 a Hindrance? |url=https://www.jsoftware.com/papers/indexorigin.htm |website=jsoftware.com |publisher=JSoftware |access-date=19 January 2015}}</ref> Some recent languages, such as [[Lua (programming language)|Lua]] and [[Visual Basic]], have adopted the same convention for the same reason.

Zero is the lowest unsigned integer value, one of the most fundamental types in programming and hardware design. In computer science, [[0 (number)|zero]] is thus often used as the base case for many kinds of numerical [[recursion]]. Proofs and other sorts of mathematical reasoning in computer science often begin with zero. For these reasons, in computer science it is not unusual to number from zero rather than one.

If an array is used to represent a cycle, it is convenient to obtain the index with a [[modulo operator|modulo function]], which can result in zero.

=== Numerical properties ===
With zero-based numbering, a range can be expressed as the half-open [[Interval (mathematics)|interval]], {{math|[0, ''n'')}}, as opposed to the closed interval, {{math|[1, ''n'']}}. Empty ranges, which often occur in algorithms, are tricky to express with a closed interval without resorting to obtuse conventions like {{math|[1, 0]}}. Because of this property, zero-based indexing potentially reduces [[Off-by-one error|off-by-one]] and [[fencepost error]]s.<ref name="dijkstra">{{cite web |url=https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html |title=Why numbering should start at zero (EWD 831) |last=Dijkstra |first=Edsger Wybe |author-link=Edsger W. Dijkstra |date=May 2, 2008 |work=E. W. Dijkstra Archive |publisher=[[University of Texas at Austin]] |access-date=2011-03-16}}</ref> On the other hand, the repeat count {{mvar|n}} is calculated in advance, making the use of counting from 0 to {{math|''n'' − 1}} (inclusive) less intuitive. Some authors prefer one-based indexing, as it corresponds more closely to how entities are indexed in other contexts.<ref>Programming Microsoft® Visual C# 2005 by Donis Marshall.</ref>

Another property of this convention is in the use of [[modular arithmetic]] as implemented in modern computers. Usually, the [[modulo operation|modulo function]] maps any integer modulo {{mvar|N}} to one of the numbers {{math|0, 1, 2, ..., ''N'' − 1}}, where {{math|''N'' ≥ 1}}. Because of this, many formulas in algorithms (such as that for calculating hash table indices) can be elegantly expressed in code using the modulo operation when array indices start at zero.

Pointer operations can also be expressed more elegantly on a zero-based index due to the underlying address/offset logic mentioned above. To illustrate, suppose {{mvar|a}} is the [[memory address]] of the first element of an array, and {{mvar|i}} is the index of the desired element. To compute the address of the desired element, if the index numbers count from 1, the desired address is computed by this expression:

: <math>a + s \times (i-1),</math>

where {{mvar|s}} is the size of each element. In contrast, if the index numbers count from 0, the expression becomes

: <math>a + s \times i.</math>

This simpler expression is more efficient to compute at [[run time (program lifecycle phase)|run time]].

However, a language wishing to index arrays from 1 could adopt the convention that every array address is represented by {{math|1=''a''′ = ''a'' – ''s''}}; that is, rather than using the address of the first array element, such a language would use the address of a fictitious element located immediately before the first actual element. The indexing expression for a 1-based index would then be

: <math>a' + s \times i.</math>

Hence, the efficiency benefit at run time of zero-based indexing is not inherent, but is an artifact of the decision to represent an array with the address of its first element rather than the address of the fictitious zeroth element. However, the address of that fictitious element could very well be the address of some other item in memory not related to the array.

Superficially, the fictitious element doesn't scale well to multidimensional arrays. Indexing multidimensional arrays from zero makes a naive (contiguous) conversion to a linear address space (systematically varying one index after the other) look simpler than when indexing from one. For instance, when mapping the three-dimensional array {{math|A[''P''][''N''][''M'']}} to a linear array {{math|L[''M⋅N⋅P'']}}, both with {{mvar|M ⋅ N ⋅ P}} elements, the index {{mvar|r}} in the linear array to access a specific element with {{math|L[''r''] {{=}} A[''z''][''y''][''x'']}} in zero-based indexing, i.e. {{math|[0 ≤ ''x'' < ''P'']}}, {{math|[0 ≤ ''y'' < ''N'']}}, {{math|[0 ≤ ''z'' < ''M'']}}, and {{math|[0 ≤ ''r'' < ''M ⋅ N ⋅ P'']}}, is calculated by 
:<math>r = z \cdot M \cdot N + y \cdot M + x.</math>
Organizing all arrays with 1-based indices ({{math|[1 ≤ ''x′'' ≤ ''P'']}}, {{math|[1 ≤ ''y′'' ≤ ''N'']}}, {{math|[1 ≤ ''z′'' ≤ ''M'']}}, {{math|[1 ≤ ''r′'' ≤ ''M ⋅ N ⋅ P'']}}), and assuming an analogous arrangement of the elements, gives 
:<math>r' = (z'-1) \cdot M \cdot N + (y'-1) \cdot M + (x'-0)</math>
to access the same element, which arguably looks more complicated. Of course, {{math|''r''′ {{=}} ''r'' + 1,}} since {{math|[''z'' {{=}} ''z''′ – 1],}} {{math|[''y'' {{=}} ''y''′ – 1],}} and {{math|[''x'' {{=}} ''x''′ – 1].}} A simple and everyday-life example is [[positional notation]], which the invention of the zero made possible. In positional notation, tens, hundreds, thousands and all other digits start with zero, only units start at one.<ref>{{cite AV media | url=https://www.khanacademy.org/math/cc-1st-grade-math/cc-1st-place-value | title= Math 1st Grade / Place Value / Number grid | quote=Youtube title: Number grid / Counting / Early Math / Khan Academy | publisher=Khan Academy | author = Sal Khan | access-date = July 28, 2018}}.</ref>

<div><ul>
<li style="display: inline-table;">
{| class="wikitable"
|+ ''Zero''-based indices
|- 
! {{diagonal split header|{{mvar|y}} | {{mvar|x}}}} !! 0 !! 1 !! 2 !! .. !! {{tmath|1=x = x' - 1}} !! .. !! 8 !! 9 
|-
! scope="row" | 0
| {{gray|0}}0 || {{gray|0}}1 || {{gray|0}}2 || || || || {{gray|0}}8 || {{gray|0}}9 
|-
! scope="row" | 1
| 10 || 11 || 12 || || || || 18 || 19 
|-
! scope="row" | 2
| 20 || 21 || 22 || || || || 28 || 29 
|-
! scope="row" | ..
|  ||  ||  || || || || ||  
|-
! scope="row" | {{tmath|1=y = y' - 1}}
|  ||  ||  || || {{tmath|1=y \cdot M + x}} || || ||  
|-
! scope="row" | ..
|  ||  ||  || || || || ||  
|-
! scope="row" | 8
| 80 || 81 || 82 || || || || 88 || 89 
|-
! scope="row" | 9
| 90 || 91 || 92 || || || || 98 || 99
|-
| colspan="9" style="text-align: center;"| The table content represents the index {{mvar|r}}.
|} </li>&emsp;
<li style="display: inline-table;">
{| class="wikitable"|
|+ ''One''-based indices
|-
! {{diagonal split header|{{mvar|y'}} | {{mvar|x'}}}} !! 1 !! 2 !! 3 !! .. !! {{tmath|1=x' = x + 1}} !! .. !! 9 !! 10
|-
! scope="row" | 1
| {{gray|0}}1 || {{gray|0}}2 || {{gray|0}}3 || || || ||  {{gray|0}}9 || 10
|-
! scope="row" | 2
| 11 || 12 || 13 || || || || 19 || 20
|-
! scope="row" | 3
|  21 || 22 || 23 || || || || 29 || 30 
|-
! scope="row" | ..
|  ||  ||  || || || || || 
|-
! scope="row" | {{tmath|1=y' = y + 1}}
|  ||  ||  || || {{tmath|1=(y'-1) \cdot M + x'}} || || || 
|-
! scope="row" | ..
|  ||  ||  || || || || || 
|-
! scope="row" | 9
| 81 || 82 || 83 || || || || 89 || 90
|-
! scope="row" | 10
| 91 || 92 || 93 || || || || 99 || 100
|-
| colspan="9" style="text-align: center;"| The table content represents the index {{mvar|r′}}.
|}</li>
</ul></div>

This situation can lead to some confusion in terminology. In a zero-based indexing scheme, the first element is "element number zero"; likewise, the twelfth element is "element number eleven". Therefore, an analogy from the ordinal numbers to the quantity of objects numbered appears; the highest index of {{mvar|n}} objects will be {{math|''n'' − 1}}, and it refers to the {{mvar|n}}th element. For this reason, the first element is sometimes referred to as the [[array data structure|zeroth]] element, in an attempt to avoid confusion.