==Uses==
Pointers are directly supported without restrictions in languages such as [[PL/I]], [[C (programming language)|C]], [[C++]], [[Pascal (programming language)|Pascal]], [[FreeBASIC]], and implicitly in most [[assembly language]]s. They are used mainly to construct [[Reference (computer science)|references]], which in turn are fundamental to constructing nearly all [[data structure]]s, and to pass data between different parts of a program.

In [[functional programming]] languages that rely heavily on lists, data references are managed abstractly by using primitive constructs like [[cons]] and the corresponding elements [[car and cdr]], which can be thought of as specialised pointers to the first and second components of a cons-cell. This gives rise to some of the idiomatic "flavour" of functional programming. By structuring data in such [[linked list|cons-lists]], these languages facilitate [[Recursion (computer science)|recursive]] means for building and processing data—for example, by recursively accessing the head and tail elements of lists of lists; e.g. "taking the car of the cdr of the cdr". By contrast, memory management based on pointer dereferencing in some approximation of an [[Array data structure|array]] of memory addresses facilitates treating variables as slots into which data can be assigned [[Imperative programming|imperatively]].

When dealing with arrays, the critical [[lookup table|lookup]] operation typically involves a stage called ''address calculation'' which involves constructing a pointer to the desired data element in the array. In other data structures, such as [[linked list]]s, pointers are used as references to explicitly tie one piece of the structure to another.
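
The address calculation for an array of fixed-size elements can be sketched in C as follows. The helper names <code>element_address</code> and <code>address_matches</code> are hypothetical and serve only as an illustration; the first computes by hand roughly what a compiler generates for <code>&table[i]</code>.

<syntaxhighlight lang="C">
#include <stddef.h>

/* Hypothetical helper: compute the address of element i by hand.
 * base      - address of the first element
 * elem_size - size of one element in bytes
 * i         - zero-based index of the desired element */
void *element_address(void *base, size_t elem_size, size_t i) {
    /* address = base + i * elem_size, calculated in units of bytes */
    return (unsigned char *)base + i * elem_size;
}

/* Returns 1 if the hand-computed address matches the compiler's. */
int address_matches(void) {
    int table[4] = {10, 20, 30, 40};
    int *p = element_address(table, sizeof table[0], 2);
    return p == &table[2] && *p == 30;
}
</syntaxhighlight>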

Pointers are used to pass parameters by reference. This is useful if the programmer wants a function's modifications to a parameter to be visible to the function's caller. This is also useful for returning multiple values from a function.
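
As a minimal sketch of returning multiple values, the hypothetical C function <code>divide</code> below delivers a quotient and a remainder through two pointer parameters and uses its ordinary return value only to report success or failure. The name and the error convention are assumptions made for this illustration.

<syntaxhighlight lang="C">
/* Hypothetical example: return a quotient and a remainder through
 * pointer parameters. Returns 0 on success, -1 if divisor is zero. */
int divide(int dividend, int divisor, int *quotient, int *remainder) {
    if (divisor == 0)
        return -1;              /* the outputs are left untouched */
    *quotient  = dividend / divisor;
    *remainder = dividend % divisor;
    return 0;
}
</syntaxhighlight>

A caller that declares <code>int q, r;</code> and calls <code>divide(17, 5, &q, &r)</code> finds 3 in <code>q</code> and 2 in <code>r</code> afterwards.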

Pointers can also be used to [[Memory allocation|allocate]] and deallocate dynamic variables and arrays in memory. Since a variable will often become redundant after it has served its purpose, it is a waste of memory to keep it, and therefore it is good practice to deallocate it (using the original pointer reference) when it is no longer needed. Failure to do so may result in a ''[[memory leak]]'' (where available free memory gradually, or in severe cases rapidly, diminishes because of an accumulation of numerous redundant memory blocks).

===C pointers===
The basic [[Syntax (programming languages)|syntax]] to define a pointer is:<ref>[[#c-std|ISO/IEC 9899]], clause 6.7.5.1, paragraph 1.</ref>

<syntaxhighlight lang="C">int *ptr;</syntaxhighlight>

This declares <code>ptr</code> as the identifier of an object of the following type:
* pointer that points to an object of type <code>int</code>
This is usually stated more succinctly as "<code>ptr</code> is a pointer to <code>int</code>."

Because the C language does not specify an implicit initialization for objects of automatic storage duration,<ref>[[#c-std|ISO/IEC 9899]], clause 6.7.8, paragraph 10.</ref> care should often be taken to ensure that the address to which <code>ptr</code> points is valid; this is why it is sometimes suggested that a pointer be explicitly initialized to the [[null pointer]] value, which is traditionally specified in C with the standardized macro <code>NULL</code>:<ref name="c-NULL">[[#c-std|ISO/IEC 9899]], clause 7.17, paragraph 3: ''NULL... which expands to an implementation-defined null pointer constant...''</ref>

<syntaxhighlight lang="C">int *ptr = NULL;</syntaxhighlight>

Dereferencing a null pointer in C produces [[undefined behavior]],<ref>[[#c-std|ISO/IEC 9899]], clause 6.5.3.2, paragraph 4, footnote 87: ''If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined... Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer...''</ref> which could be catastrophic. However, most implementations{{citation needed|date=July 2011}} simply halt execution of the program in question, usually with a [[segmentation fault]].

On the other hand, initializing pointers unnecessarily could hinder program analysis, thereby hiding bugs.

In any case, once a pointer has been declared, the next logical step is for it to point at something:

<syntaxhighlight lang="C">
int a = 5;
int *ptr = NULL;

ptr = &a;
</syntaxhighlight>

This assigns the address of <code>a</code> to <code>ptr</code>. For example, if <code>a</code> is stored at memory location 0x8130, then the value of <code>ptr</code> will be 0x8130 after the assignment. To dereference the pointer, an asterisk is used again:

<syntaxhighlight lang="C">*ptr = 8;</syntaxhighlight>

This means take the contents of <code>ptr</code> (which is 0x8130), "locate" that address in memory and set its value to 8.
If <code>a</code> is later accessed again, its new value will be 8.

This example may be clearer if memory is examined directly.
Assume that <code>a</code> is located at address 0x8130 in memory and <code>ptr</code> at 0x8134; also assume this is a 32-bit machine such that an <code>int</code> is 32 bits wide. The following is what memory would contain after this code snippet is executed:

<syntaxhighlight lang="C">
int a = 5;
int *ptr = NULL;
</syntaxhighlight>

{| class="wikitable" style="padding-left: 2em;"
! Address !! Contents
|-
| '''0x8130''' || 0x00000005
|-
| '''0x8134''' || 0x00000000
|}

(The NULL pointer shown here is 0x00000000.)
By assigning the address of <code>a</code> to <code>ptr</code>:

<syntaxhighlight lang="C">ptr = &a;</syntaxhighlight>

yields the following memory values:

{| class="wikitable" style="padding-left: 2em;"
! Address !! Contents
|-
| '''0x8130''' || 0x00000005
|-
| '''0x8134''' || 0x00008130
|}

Then, when <code>ptr</code> is dereferenced by coding:

<syntaxhighlight lang="C">*ptr = 8;</syntaxhighlight>

the computer will take the contents of <code>ptr</code> (which is 0x8130), "locate" that address, and assign 8 to that location, yielding the following memory:

{| class="wikitable" style="padding-left: 2em;"
! Address !! Contents
|-
| '''0x8130''' || 0x00000008
|-
| '''0x8134''' || 0x00008130
|}

Clearly, accessing <code>a</code> will yield the value of 8 because the previous instruction modified the contents of <code>a</code> by way of the pointer <code>ptr</code>.
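
The three steps can also be collected into one small function, as in the sketch below. The function name <code>demo</code> is hypothetical, and the address that is printed depends on the implementation; it will generally not be 0x8130.

<syntaxhighlight lang="C">
#include <stdio.h>

/* Repeats the steps described above and returns the final value of a. */
int demo(void) {
    int a = 5;
    int *ptr = NULL;

    ptr = &a;    /* ptr now holds the address of a */
    *ptr = 8;    /* writes 8 into a through the pointer */

    /* %p prints an implementation-defined representation of the address */
    printf("a = %d, *ptr = %d, ptr = %p\n", a, *ptr, (void *)ptr);
    return a;
}
</syntaxhighlight>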

===Use in data structures===
When setting up [[data structure]]s like [[List (computing)|lists]], [[Queue (abstract data type)|queues]] and trees, it is necessary to have pointers to help manage how the structure is implemented and controlled. Typical examples of pointers are start pointers, end pointers, and [[Stack (abstract data type)|stack]] pointers. These pointers can either be '''absolute''' (the actual [[physical address]] or a [[virtual address]] in [[virtual memory]]) or '''relative''' (an [[offset (computer science)|offset]] from an absolute start address ("base") that typically uses fewer bits than a full address, but will usually require one additional arithmetic operation to resolve).

Relative addresses are a form of manual [[memory segmentation]], and share many of its advantages and disadvantages. A two-byte offset, containing a 16-bit, unsigned integer, can be used to provide relative addressing for up to 64 [[kibibytes|KiB]] (2<sup>16</sup> bytes) of a data structure. This can easily be extended to 128, 256 or 512 KiB if the address pointed to is forced to be [[Data structure alignment|aligned]] on a half-word, word or double-word boundary, at the cost of an additional "shift left" [[bitwise operation]] (by 1, 2 or 3 bits) to scale the offset by a factor of 2, 4 or 8 before it is added to the base address. Generally, though, such schemes are a lot of trouble, and for the programmer's convenience absolute addresses (and, underlying them, a ''[[flat address space]]'') are preferred.
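
A relative pointer of this kind might be resolved as in the following C sketch. The helper names <code>resolve_offset</code> and <code>addressable_bytes</code> are hypothetical, and the sketch assumes the offset is stored in units of the alignment, so that it has to be shifted left before it is added to the base.

<syntaxhighlight lang="C">
#include <stdint.h>

/* Hypothetical helper: resolve a 16-bit relative pointer.
 * base   - absolute start address of the data structure
 * offset - unsigned 16-bit offset, stored in units of (1 << shift) bytes
 * shift  - 0 for byte granularity; 1, 2 or 3 for 2-, 4- or 8-byte alignment */
void *resolve_offset(void *base, uint16_t offset, unsigned shift) {
    /* Widen before shifting so the scaled offset is not cut off at 16 bits. */
    uint32_t byte_offset = (uint32_t)offset << shift;
    return (unsigned char *)base + byte_offset;
}

/* Number of bytes that a 16-bit offset can reach with a given shift. */
uint32_t addressable_bytes(unsigned shift) {
    return (uint32_t)65536 << shift;
}
</syntaxhighlight>

With a shift of 0 the reachable range is 64 KiB, and with a shift of 3 it is 512 KiB, at the price of the extra shift operation on every access.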

A one-byte offset, such as the hexadecimal [[ASCII]] value of a character (e.g. X'29'), can be used to point to an alternative integer value (or index) in an array (e.g., X'01'). In this way, characters can be very efficiently translated from '[[raw data]]' to a usable sequential [[Array data structure|index]] and then to an absolute address without a [[lookup table]].

====C arrays====
In C, array indexing is formally defined in terms of pointer arithmetic; that is, the language specification requires that <code>array[i]</code> be equivalent to <code>*(array + i)</code>.<ref name="Plauger1992">{{cite book |title=ANSI and ISO Standard C Programmer's Reference |last=Plauger |first=P J |author-link=P. J. Plauger |author2=Brodie, Jim |year=1992 |publisher=Microsoft Press |location=Redmond, WA |isbn=978-1-55615-359-4 |pages=[https://archive.org/details/ansiisostandardc00plau/page/108 108, 51] |quote=An array type does not contain additional holes because all other types pack tightly when composed into arrays ''[at page 51]'' |url-access=registration |url=https://archive.org/details/ansiisostandardc00plau/page/108}}</ref> Thus in C, arrays can be thought of as pointers to consecutive areas of memory (with no gaps),<ref name="Plauger1992" /> and the syntax for accessing arrays is identical to that which can be used to dereference pointers. For example, an array <code>array</code> can be declared and used in the following manner:

<syntaxhighlight lang="c">
int array[5];      /* Declares 5 contiguous integers */
int *ptr = array;  /* Arrays can be used as pointers */
ptr[0] = 1;        /* Pointers can be indexed with array syntax */
*(array + 1) = 2;  /* Arrays can be dereferenced with pointer syntax */
*(1 + array) = 2;  /* Pointer addition is commutative */
2[array] = 4;      /* Subscript operator is commutative */
</syntaxhighlight>

This allocates a block of five integers and names the block <code>array</code>, which acts as a pointer to the block. Another common use of pointers is to point to dynamically allocated memory from [[malloc]] which returns a consecutive block of memory of no less than the requested size that can be used as an array.

While most operators on arrays and pointers are equivalent, the result of the <code>[[Sizeof#Using sizeof with arrays|sizeof]]</code> operator differs. In this example, <code>sizeof(array)</code> will evaluate to <code>5*sizeof(int)</code> (the size of the array), while <code>sizeof(ptr)</code> will evaluate to <code>sizeof(int*)</code>, the size of the pointer itself.
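
The difference matters most when an array is passed to a function, because the function receives only a pointer to the first element. The sketch below illustrates this; the names <code>size_seen_by_callee</code>, <code>ARRAY_LENGTH</code> and <code>length_demo</code> are hypothetical.

<syntaxhighlight lang="C">
#include <stddef.h>

/* An array argument arrives as a pointer to its first element, so
 * sizeof yields the size of the pointer, not of the caller's array. */
size_t size_seen_by_callee(int *values) {
    return sizeof values;
}

/* Element count; valid only where the real array type is visible. */
#define ARRAY_LENGTH(a) (sizeof(a) / sizeof((a)[0]))

size_t length_demo(void) {
    int array[5];
    return ARRAY_LENGTH(array);   /* 5 * sizeof(int) / sizeof(int) */
}
</syntaxhighlight>

For this reason, C functions that operate on arrays usually take the element count as a separate parameter.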

Initial values for the elements of an array can be specified in its declaration:

<syntaxhighlight lang="C">
int array[5] = {2, 4, 3, 1, 5};
</syntaxhighlight>

If <code>array</code> is located in memory starting at address 0x1000 on a 32-bit [[Endianness#Little-endian|little-endian]] machine then memory will contain the following (values are in [[hexadecimal]], like the addresses):

:{| class="wikitable" style="font-family:monospace;"
|-
|  
! 0 || 1 || 2 || 3
|-
! 1000 
| 2 || 0 || 0 || 0
|-
! 1004 
| 4 || 0 || 0 || 0
|-
! 1008 
| 3 || 0 || 0 || 0
|-
! 100C 
| 1 || 0 || 0 || 0
|-
! 1010 
| 5 || 0 || 0 || 0
|}
Represented here are five integers: 2, 4, 3, 1, and 5. These five integers occupy 32 bits (4 bytes) each with the least-significant byte stored first (this is a little-endian [[CPU architecture]]) and are stored consecutively starting at address 0x1000.

The syntax for C with pointers is:
* <code>array</code> means 0x1000;
* <code>array + 1</code> means 0x1004: the "+ 1" means to add the size of 1 <code>int</code>, which is 4 bytes;
* <code>*array</code> means to dereference the contents of <code>array</code>. Considering the contents as a memory address (0x1000), look up the value at that location (0x00000002);
* <code>array[i]</code> means element number <code>i</code>, 0-based, of <code>array</code> which is translated into <code>*(array + i)</code>.

The last example is how to access the contents of <code>array</code>. Breaking it down:
* <code>array + i</code> is the memory location of the (i)<sup>th</sup> element of <code>array</code>, starting at i=0;
* <code>*(array + i)</code> takes that memory address and dereferences it to access the value.

====C linked list====
Below is an example definition of a [[linked list]] in C.

<syntaxhighlight lang="C">
/* the empty linked list is represented by NULL
 * or some other sentinel value */
#define EMPTY_LIST  NULL

struct link {
    void        *data;  /* data of this link */
    struct link *next;  /* next link; EMPTY_LIST if there is none */
};
</syntaxhighlight>

This pointer-recursive definition is essentially the same as the reference-recursive definition from the language [[Haskell]]:

<syntaxhighlight lang="haskell">
 data Link a = Nil
             | Cons a (Link a)
</syntaxhighlight>
<code>Nil</code> is the empty list, and <code>Cons a (Link a)</code> is a [[cons]] cell of type <code>a</code> with another link also of type <code>a</code>.

The definition with references, however, is type-checked and does not use potentially confusing sentinel values. For this reason, data structures in C are usually dealt with via [[wrapper function]]s, which are carefully checked for correctness.
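
A few such wrapper functions for the C definition might look like the following sketch. The function names <code>list_push</code>, <code>list_length</code> and <code>list_free</code> are hypothetical, and the <code>struct link</code> definition is repeated so that the example is self-contained.

<syntaxhighlight lang="C">
#include <stdlib.h>

struct link {
    void        *data;  /* data of this link */
    struct link *next;  /* next link; NULL if there is none */
};

/* Hypothetical wrapper: prepend a link holding data to list.
 * Returns the new head, or NULL if allocation fails (list is unchanged). */
struct link *list_push(struct link *list, void *data) {
    struct link *head = malloc(sizeof *head);
    if (head == NULL)
        return NULL;
    head->data = data;
    head->next = list;
    return head;
}

/* Hypothetical wrapper: count the links by following the next pointers. */
size_t list_length(const struct link *list) {
    size_t n = 0;
    for (; list != NULL; list = list->next)
        n++;
    return n;
}

/* Hypothetical wrapper: free every link (but not the data it points to). */
void list_free(struct link *list) {
    while (list != NULL) {
        struct link *next = list->next;  /* save before freeing */
        free(list);
        list = next;
    }
}
</syntaxhighlight>

Because callers go through these functions, the checks for the empty list and for allocation failure are written once instead of at every point of use.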

===Pass-by-address using pointers===
Pointers can be used to pass variables by their address, allowing their value to be changed. For example, consider the following [[C (programming language)|C]] code:

<syntaxhighlight lang="C">
/* a copy of the int n can be changed within the function without affecting the calling code */
void passByValue(int n) {
    n = 12;
}

/* a pointer m is passed instead. No copy of the value pointed to by m is created */
void passByAddress(int *m) {
    *m = 14;
}

int main(void) {
    int x = 3;

    /* pass a copy of x's value as the argument */
    passByValue(x);
    // the value was changed inside the function, but x is still 3 from here on

    /* pass x's address as the argument */
    passByAddress(&x);
    // x was actually changed by the function and is now equal to 14 here

    return 0;
}
</syntaxhighlight>

===Dynamic memory allocation===
In some programs, the required amount of memory depends on what ''the user'' may enter. In such cases the programmer needs to allocate memory dynamically. This is done by allocating memory on the ''heap'' rather than on the ''stack'', where variables usually are stored (although variables can also be stored in the CPU registers). Dynamically allocated memory can be accessed only through pointers; unlike ordinary variables, such blocks cannot be given names.

Pointers are used to store and manage the addresses of [[dynamic memory allocation|dynamically allocated]] blocks of memory. Such blocks are used to store data objects or arrays of objects. Most structured and object-oriented languages provide an area of memory, called the ''heap'' or ''free store'', from which objects are dynamically allocated.

The example C code below illustrates how structure objects are dynamically allocated and referenced. The [[standard C library]] provides the function [[malloc|<code>malloc()</code>]] for allocating memory blocks from the heap. It takes the size of an object to allocate as a parameter and returns a pointer to a newly allocated block of memory suitable for storing the object, or it returns a null pointer if the allocation failed.

<syntaxhighlight lang="C">
#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* memset, strlen, strcpy */

/* Parts inventory item */
struct Item {
    int         id;     /* Part number */
    char *      name;   /* Part name   */
    float       cost;   /* Cost        */
};

/* Allocate and initialize a new Item object */
struct Item * make_item(const char *name) {
    struct Item * item;

    /* Allocate a block of memory for a new Item object */
    item = malloc(sizeof(struct Item));
    if (item == NULL)
        return NULL;

    /* Initialize the members of the new Item */
    memset(item, 0, sizeof(struct Item));
    item->id =   -1;
    item->name = NULL;
    item->cost = 0.0;

    /* Save a copy of the name in the new Item */
    item->name = malloc(strlen(name) + 1);
    if (item->name == NULL) {
        free(item);
        return NULL;
    }
    strcpy(item->name, name);

    /* Return the newly created Item object */
    return item;
}
</syntaxhighlight>

The code below illustrates how memory objects are dynamically deallocated, i.e., returned to the heap or free store. The standard C library provides the function [[free()|<code>free()</code>]] for deallocating a previously allocated memory block and returning it to the heap.

<syntaxhighlight lang="C">
/* Deallocate an Item object */
void destroy_item(struct Item *item) {
    /* Check for a null object pointer */
    if (item == NULL)
        return;

    /* Deallocate the name string saved within the Item */
    if (item->name != NULL) {
        free(item->name);
        item->name = NULL;
    }

    /* Deallocate the Item object itself */
    free(item);
}
</syntaxhighlight>
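
Arrays whose length is known only at run time can be allocated in the same way. The following is a minimal sketch with a hypothetical helper named <code>make_sequence</code>; the caller is assumed to release the returned block with <code>free()</code>.

<syntaxhighlight lang="C">
#include <stdint.h>  /* SIZE_MAX */
#include <stdlib.h>  /* malloc */

/* Hypothetical helper: allocate an array of n ints holding 0, 1, ..., n-1.
 * Returns NULL if n is 0, if the byte count would overflow, or if the
 * allocation fails. The caller must pass the result to free(). */
int *make_sequence(size_t n) {
    int *values;
    size_t i;

    if (n == 0 || n > SIZE_MAX / sizeof *values)
        return NULL;
    values = malloc(n * sizeof *values);
    if (values == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        values[i] = (int)i;   /* the block is indexed like an array */
    return values;
}
</syntaxhighlight>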

===Memory-mapped hardware===
On some computing architectures, pointers can be used to directly manipulate memory or memory-mapped devices.

Assigning addresses to pointers is an invaluable tool when programming [[microcontroller]]s. Below is a simple example that declares a pointer to <code>int</code> and initialises it to a [[hexadecimal]] address, in this example the constant 0x7FFF:

<syntaxhighlight lang="C">
int *hardware_address = (int *)0x7FFF;
</syntaxhighlight>

In the mid-1980s, using the [[BIOS]] to access the video capabilities of PCs was slow. Display-intensive applications typically accessed [[Color Graphics Adapter|CGA]] video memory directly by casting the [[hexadecimal]] constant 0xB8000 to a pointer to an array of 80 unsigned 16-bit int values. Each value consisted of an [[ASCII]] code in the low byte, and a colour in the high byte. Thus, to put the letter 'A' at row 5, column 2 in bright white on blue, one would write code like the following:

<syntaxhighlight lang="C">
#define VID ((unsigned short (*)[80])0xB8000)

void foo(void) {
    VID[4][1] = 0x1F00 | 'A';
}
</syntaxhighlight>
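
The layout of each 16-bit value can be tried out without the original hardware by writing into an ordinary array instead of the memory-mapped address. The sketch below does this; the names <code>make_cell</code> and <code>put_cell</code> are hypothetical, and the attribute layout assumed is the usual text-mode one, with the foreground colour in bits 0–3 and the background colour in bits 4–6 of the high byte.

<syntaxhighlight lang="C">
#include <stdint.h>

#define COLUMNS 80

/* Hypothetical helper: build one text-mode cell.
 * Low byte:  character code.
 * High byte: foreground colour in bits 0-3, background in bits 4-6. */
uint16_t make_cell(char ch, unsigned fg, unsigned bg) {
    unsigned attribute = ((bg & 0x07u) << 4) | (fg & 0x0Fu);
    return (uint16_t)((attribute << 8) | (unsigned char)ch);
}

/* Write a cell at a 1-based row and column of a screen buffer.
 * On real hardware, screen would be the memory-mapped address; here it
 * can be any sufficiently large array. The volatile qualifier tells the
 * compiler not to optimise the store away. */
void put_cell(volatile uint16_t *screen, unsigned row, unsigned col,
              uint16_t cell) {
    screen[(row - 1) * COLUMNS + (col - 1)] = cell;
}
</syntaxhighlight>

Calling <code>make_cell('A', 15, 1)</code> produces the same value as <code>0x1F00 | 'A'</code> in the example above, namely bright white (15) on blue (1).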

===Use in control tables===
{{see also|#Function pointer|#Wild branch}}
[[Control table]]s that are used to control [[program flow]] usually make extensive use of pointers. The pointers, usually embedded in a table entry, may, for instance, be used to hold the entry points to [[subroutine]]s to be executed, based on certain conditions defined in the same table entry. Alternatively, the pointers may simply be indexes into separate but associated tables that contain the actual addresses, or they may be the addresses themselves (depending upon the programming language constructs available). They can also be used to point to earlier table entries (as in loop processing) or forward to skip some table entries (as in a [[Switch statement|switch]] or "early" exit from a loop). For this latter purpose, the "pointer" may simply be the table entry number itself and can be transformed into an actual address by simple arithmetic.
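
A minimal sketch of such a table in C is shown below, using function pointers as the embedded entry points. The table contents and the names <code>control_table</code> and <code>dispatch</code> are hypothetical and chosen only for illustration.

<syntaxhighlight lang="C">
#include <stddef.h>

/* Hypothetical subroutines that the table can dispatch to. */
static int add(int a, int b)      { return a + b; }
static int subtract(int a, int b) { return a - b; }
static int multiply(int a, int b) { return a * b; }

/* One table entry: a condition (the opcode) and the entry point to call. */
struct entry {
    char opcode;
    int (*handler)(int, int);
};

static const struct entry control_table[] = {
    { '+', add },
    { '-', subtract },
    { '*', multiply },
};

/* Search the table for opcode and call the matching subroutine.
 * Returns 0 and stores the result on success, -1 if no entry matches. */
int dispatch(char opcode, int a, int b, int *result) {
    size_t i;
    size_t count = sizeof control_table / sizeof control_table[0];

    for (i = 0; i < count; i++) {
        if (control_table[i].opcode == opcode) {
            *result = control_table[i].handler(a, b);
            return 0;
        }
    }
    return -1;
}
</syntaxhighlight>

Adding an operation then means adding one table entry rather than changing the control flow of <code>dispatch</code>.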