
== Aitchison geometry ==
The simplex can be given the structure of a [[vector space]] in several different ways. The following vector space structure is called '''Aitchison geometry''' or the '''Aitchison simplex''' and has the following operations:

; Perturbation (vector addition)

:: <math> x \oplus y = \left[\frac{x_1 y_1}{\sum_{i=1}^D x_i y_i},\frac{x_2 y_2}{\sum_{i=1}^D x_i y_i}, \dots, \frac{x_D y_D}{\sum_{i=1}^D x_i y_i}\right] = C[x_1 y_1, \ldots, x_D y_D]  \qquad \forall x, y \in S^D
</math>

; Powering (scalar multiplication)

:: <math> \alpha \odot x = \left[\frac{x_1^\alpha}{\sum_{i=1}^D x_i^\alpha},\frac{x_2^\alpha}{\sum_{i=1}^D x_i^\alpha}, \ldots,\frac{x_D^\alpha}{\sum_{i=1}^D x_i^\alpha} \right] = C[x_1^\alpha, \ldots, x_D^\alpha]  \qquad \forall x \in S^D, \; \alpha \in \mathbb{R}
</math>

; Inner product

:: <math> \langle x, y \rangle = \frac{1}{2D} 
\sum_{i=1}^D 
\sum_{j=1}^D
\log \frac{x_i}{x_j} 
\log \frac{y_i}{y_j} 
\qquad \forall x, y \in S^D</math>

Endowed with those operations, the Aitchison simplex forms a <math>(D-1)</math>-dimensional Euclidean [[inner product space]]. The uniform composition <math>\left[\frac{1}{D}, \dots,  \frac{1}{D}\right]</math> is the [[zero vector]].
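The three operations above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names (<code>closure</code>, <code>perturb</code>, <code>power</code>, <code>aitchison_inner</code>) and the example compositions are assumptions introduced here, and all parts are assumed to be strictly positive.

```python
import numpy as np

def closure(x):
    """Rescale a vector of positive parts so that it sums to 1 (the operator C)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def perturb(x, y):
    """Perturbation: component-wise product followed by closure."""
    return closure(np.asarray(x, dtype=float) * np.asarray(y, dtype=float))

def power(alpha, x):
    """Powering: component-wise power followed by closure."""
    return closure(np.asarray(x, dtype=float) ** alpha)

def aitchison_inner(x, y):
    """Inner product built from all pairwise log-ratios, scaled by 1/(2D)."""
    lx = np.log(np.asarray(x, dtype=float))
    ly = np.log(np.asarray(y, dtype=float))
    dx = lx[:, None] - lx[None, :]  # entry (i, j) is log(x_i / x_j)
    dy = ly[:, None] - ly[None, :]  # entry (i, j) is log(y_i / y_j)
    return (dx * dy).sum() / (2 * len(lx))

# Hypothetical example compositions with D = 3 parts.
x = closure([1.0, 2.0, 7.0])
y = closure([3.0, 3.0, 4.0])
zero = closure(np.ones(3))  # the uniform composition

# Perturbing by the uniform composition leaves x unchanged (zero vector).
assert np.allclose(perturb(x, zero), x)
# Powering by -1 gives the additive inverse of x.
assert np.allclose(perturb(x, power(-1.0, x)), zero)
# The inner product is linear in its first argument.
assert np.isclose(aitchison_inner(power(2.0, x), y), 2.0 * aitchison_inner(x, y))
```

The assertions check the properties stated above: the uniform composition acts as the zero vector, and powering behaves as scalar multiplication with respect to the inner product.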

=== Orthonormal bases ===
Since the Aitchison simplex forms a finite-dimensional Hilbert space, it is possible to construct orthonormal bases in the simplex. Every composition <math>x</math> can be decomposed as

:: <math> x = \bigoplus_{i=1}^{D-1} x_i^* \odot e_i </math>

where <math>e_1, \ldots, e_{D-1} </math> form an orthonormal basis in the simplex.<ref>{{harvnb|Egozcue|Pawlowsky-Glahn|Mateu-Figueras|Barcelo-Vidal|2003}}</ref> The values <math>x_i^*, i=1,2,\ldots,D-1</math> are the (orthonormal and Cartesian) coordinates of <math>x</math> with respect to the given basis. They are called isometric log-ratio <math>(\operatorname{ilr})</math> coordinates.

=== Linear transformations ===
There are three well-characterized [[isomorphism]]s that transform from the Aitchison simplex to real space. All of these transforms are linear with respect to perturbation and powering, and are given below.

==== Additive log ratio transform ====
The additive log ratio (alr) transform is an isomorphism where <math>\operatorname{alr}: S^D \rightarrow \mathbb{R}^{D-1} </math>.  This is given by

:: <math> \operatorname{alr}(x) = \left[ \log \frac{x_1}{x_D}, \cdots, \log \frac{x_{D-1}}{x_D} \right] </math>

The choice of denominator component is arbitrary, and could be any specified component.
This transform is commonly used in chemistry with measurements such as pH. It is also the transform most commonly used for [[multinomial logistic regression]]. The alr transform is not an isometry, meaning that Euclidean distances between transformed values generally differ from the Aitchison distances between the original compositions in the simplex.
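As an illustration, the alr transform and its inverse can be written as follows. This is a sketch under stated assumptions: the helper names <code>alr</code>, <code>alr_inv</code> and <code>clr</code> are introduced here for the example, the last component is used as the denominator, and all parts are assumed to be strictly positive. The final check compares the Euclidean norm of the alr coordinates with the norm of the clr coordinates, which equals the Aitchison norm.

```python
import numpy as np

def alr(x):
    """Additive log-ratio transform with the last part as denominator."""
    x = np.asarray(x, dtype=float)
    return np.log(x[:-1] / x[-1])

def alr_inv(z):
    """Inverse alr: append a zero for the reference part, exponentiate, close."""
    z = np.asarray(z, dtype=float)
    expanded = np.exp(np.append(z, 0.0))
    return expanded / expanded.sum()

def clr(x):
    """Centered log-ratio transform, used here only to obtain the Aitchison norm."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()

# Hypothetical example compositions with D = 3 parts.
x = np.array([0.1, 0.2, 0.7])
y = np.array([0.3, 0.3, 0.4])

# Round trip: alr_inv undoes alr for a composition that sums to 1.
assert np.allclose(alr_inv(alr(x)), x)

# Linearity: perturbation in the simplex becomes addition in alr coordinates.
xy = x * y / (x * y).sum()
assert np.allclose(alr(xy), alr(x) + alr(y))

# Not an isometry: the Euclidean norm of alr(x) differs from the Aitchison norm.
assert not np.isclose(np.linalg.norm(alr(x)), np.linalg.norm(clr(x)))
```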

==== Center log ratio transform ====
The center log ratio (clr) transform is both an isomorphism and an isometry where <math>\operatorname{clr}: S^D \rightarrow U, \quad U \subset \mathbb{R}^D </math>. It is given by

:: <math> \operatorname{clr}(x) = \left[ \log \frac{x_1}{g(x)}, \cdots, \log \frac{x_D}{g(x)} \right] </math>

where <math> g(x) </math> is the geometric mean of <math> x </math>. The inverse of this function is also known as the [[softmax function]].
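A short sketch of the clr transform and its softmax inverse is given below. The function names and example values are assumptions made for this illustration, and parts are assumed to be strictly positive. The checks confirm that clr coordinates sum to zero, that softmax recovers the original composition, and that the ordinary dot product of clr coordinates reproduces the Aitchison inner product.

```python
import numpy as np

def clr(x):
    """Centered log-ratio: log of each part divided by the geometric mean."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()  # the mean of the logs is log g(x)

def softmax(z):
    """Exponentiate and normalise; subtracting the max only improves stability."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def aitchison_inner(x, y):
    """Aitchison inner product from pairwise log-ratios, for comparison."""
    lx = np.log(np.asarray(x, dtype=float))
    ly = np.log(np.asarray(y, dtype=float))
    dx = lx[:, None] - lx[None, :]
    dy = ly[:, None] - ly[None, :]
    return (dx * dy).sum() / (2 * len(lx))

# Hypothetical example compositions with D = 3 parts.
x = np.array([0.1, 0.2, 0.7])
y = np.array([0.3, 0.3, 0.4])

# clr coordinates lie in the hyperplane of vectors that sum to zero.
assert np.isclose(clr(x).sum(), 0.0)
# softmax inverts clr for a composition that sums to 1.
assert np.allclose(softmax(clr(x)), x)
# Isometry: the Euclidean dot product of clr coordinates equals the inner product.
assert np.isclose(clr(x) @ clr(y), aitchison_inner(x, y))
```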

==== Isometric logratio transform ====
The isometric log ratio (ilr) transform is both an isomorphism and an isometry where <math>\operatorname{ilr}: S^D \rightarrow \mathbb{R}^{D-1} </math>. It is given by

:: <math> \operatorname{ilr}(x) = \big[ \langle x, e_1 \rangle, \ldots, \langle x, e_{D-1} \rangle\big]
</math>

There are multiple ways to construct orthonormal bases, including using the [[Gram–Schmidt_process | Gram–Schmidt orthogonalization]] or [[singular-value decomposition]] of clr transformed data.  
Another alternative is to construct log contrasts from a bifurcating tree.  If we are given a bifurcating tree, we can construct a basis from the internal nodes in the tree.

[[File:Orthogonal-tree-basis.jpg|thumb|A representation of a tree in terms of its orthogonal components. l represents an internal node, an element of the orthonormal basis. This is a precursor to using the tree as a scaffold for the ilr transform]]

Each vector in the basis would be determined as follows

:: <math> e_\ell = C[\exp( \,\underbrace{0,\ldots,0}_k, \underbrace{a,\ldots,a}_r,\underbrace{b,\ldots,b}_s,\underbrace{0,\ldots,0}_t \, )]
</math>

The elements within each vector are given as follows

:: <math> a = \frac{\sqrt{s}}{\sqrt{r(r+s)}} \quad \text{and} \quad b = \frac{-\sqrt{r}}{\sqrt{s(r+s)}}
</math>

where <math>k, r, s, t</math> are the respective numbers of tips in the corresponding subtrees shown in the figure. It can be shown that the resulting basis is orthonormal.<ref>{{harvnb|Egozcue|Pawlowsky-Glahn|2005}}</ref>
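The construction can be sketched for a small hypothetical tree with four tips, in which the root separates tips 1 and 2 from tips 3 and 4 and the two remaining internal nodes split each pair. The tuples <code>(k, r, s, t)</code>, the helper names and the tree itself are assumptions made for this example. Each row of <code>psi</code> holds the exponents of one basis vector, which are also its clr coordinates.

```python
import numpy as np

def basis_row(k, r, s, t):
    """Exponents of one basis vector: k zeros, r copies of a, s copies of b, t zeros."""
    a = np.sqrt(s) / np.sqrt(r * (r + s))
    b = -np.sqrt(r) / np.sqrt(s * (r + s))
    return np.concatenate([np.zeros(k), np.full(r, a), np.full(s, b), np.zeros(t)])

def closure(x):
    """Rescale positive parts so that they sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def clr(x):
    """Centered log-ratio transform."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()

# Hypothetical internal nodes of a balanced tree on four tips, as (k, r, s, t).
nodes = [
    (0, 2, 2, 0),  # root: tips {1, 2} against tips {3, 4}
    (0, 1, 1, 2),  # left node: tip 1 against tip 2
    (2, 1, 1, 0),  # right node: tip 3 against tip 4
]

psi = np.array([basis_row(*node) for node in nodes])     # shape (D-1, D)
basis = np.array([closure(np.exp(row)) for row in psi])  # e_l as compositions

# Each row sums to zero, so it is already the clr image of its basis vector.
assert np.allclose(psi.sum(axis=1), 0.0)
assert np.allclose(np.array([clr(e) for e in basis]), psi)
# Orthonormality: the Gram matrix of the rows is the identity.
assert np.allclose(psi @ psi.T, np.eye(len(nodes)))
```

Because the clr transform is an isometry, checking orthonormality of the rows of <code>psi</code> in ordinary Euclidean terms is equivalent to checking orthonormality of the basis in the Aitchison geometry.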

Once the basis <math>\Psi</math> is built, the ilr transform can be calculated as follows

:: <math> 
\operatorname{ilr}(x) = \operatorname{clr}(x) \Psi^T
</math>

where each element in the ilr transformed data is of the following form

:: <math> b_i = \sqrt{\frac{rs}{r+s}} \log \frac{g(x_R)}{g(x_S)}  </math>

where <math> x_R</math> and <math> x_S</math> are the sets of values corresponding to the tips in the subtrees <math> R</math> and <math> S</math>.
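The following sketch computes ilr coordinates as <math>\operatorname{clr}(x) \Psi^T</math> and compares them with the balance formula. It assumes a hypothetical balanced tree on four tips (the root separates tips 1 and 2 from tips 3 and 4, and each pair is then split); the matrix <code>psi</code>, the helper names and the example composition are all introduced here for illustration.

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean()

def gmean(v):
    """Geometric mean of a vector of positive values."""
    return np.exp(np.mean(np.log(np.asarray(v, dtype=float))))

def ilr(x, psi):
    """ilr coordinates: clr coordinates projected onto the rows of psi."""
    return clr(x) @ psi.T

def ilr_inv(z, psi):
    """Inverse ilr: map back to clr coordinates, exponentiate, and close."""
    y = np.exp(np.asarray(z, dtype=float) @ psi)
    return y / y.sum()

# Basis for the assumed tree, one row per internal node (clr coordinates).
h = 1.0 / np.sqrt(2.0)
psi = np.array([
    [0.5, 0.5, -0.5, -0.5],  # root: r = 2, s = 2
    [h,   -h,   0.0,  0.0],  # left node: r = 1, s = 1
    [0.0, 0.0,  h,   -h],    # right node: r = 1, s = 1
])

x = np.array([0.1, 0.2, 0.3, 0.4])  # hypothetical composition
z = ilr(x, psi)

# Root balance: sqrt(rs / (r + s)) * log(g(x_R) / g(x_S)) with r = s = 2.
root = np.sqrt(2 * 2 / (2 + 2)) * np.log(gmean(x[:2]) / gmean(x[2:]))
assert np.isclose(z[0], root)

# Left balance with r = s = 1 reduces to a scaled simple log-ratio.
left = np.sqrt(1 * 1 / (1 + 1)) * np.log(x[0] / x[1])
assert np.isclose(z[1], left)

# The transform is invertible and preserves the norm of the clr coordinates.
assert np.allclose(ilr_inv(z, psi), x)
assert np.isclose(np.linalg.norm(z), np.linalg.norm(clr(x)))
```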