
== Conversions ==
{{see also|Rotation formalisms in three dimensions#Conversion formulae between formalisms}}
We have seen the existence of several decompositions that apply in any dimension, namely independent planes, sequential angles, and nested dimensions. In all these cases we can either decompose a matrix or construct one. We have also given special attention to {{nowrap|3 × 3}} rotation matrices, and these warrant further attention, in both directions {{Harv|Stuelpnagel|1964}}.

=== Quaternion ===
{{Main|Quaternions and spatial rotation}}

Given the unit quaternion {{math|'''q''' {{=}} ''w'' + ''x'''''i''' + ''y'''''j''' + ''z'''''k'''}}, the equivalent pre-multiplied (to be used with column vectors) {{nowrap|3 × 3}} rotation matrix is <ref>{{cite conference |last1=Shoemake |first1=Ken |title=Computer Graphics: SIGGRAPH '85 Conference Proceedings |conference=SIGGRAPH '85, 22–26 July 1985, San Francisco |chapter=Animating rotation with quaternion curves |year=1985 |volume=19 |number=3 |pages=245–254 |doi=10.1145/325334.325242 |doi-access=free |url=https://archive.org/details/siggraph85confer0000sigg/page/n2/ |publisher=Association for Computing Machinery |isbn=0897911660 }}</ref>
:<math> Q = \begin{bmatrix}
    1 - 2 y^2 - 2 z^2 & 2 x y - 2 z w & 2 x z + 2 y w \\
    2 x y + 2 z w & 1 - 2 x^2 - 2 z^2 & 2 y z - 2 x w \\
    2 x z - 2 y w & 2 y z + 2 x w & 1 - 2 x^2 - 2 y^2
  \end{bmatrix}
.</math>

Now every [[quaternion]] component appears multiplied by two in a term of degree two, and if all such terms are zero what is left is an identity matrix. This leads to an efficient, robust conversion from any quaternion – whether unit or non-unit – to a {{nowrap|3 × 3}} rotation matrix. Given:
:<math>\begin{align}
   n &= w \times w + x \times x + y \times y + z \times z \\
   s &= \begin{cases}
          0           &\text{if } n = 0 \\
          \frac{2}{n} &\text{otherwise}
        \end{cases} \\
\end{align}</math>
we can calculate
:<math>Q = \begin{bmatrix}
  1 - s(yy + zz) &      s(xy - wz)  &      s(xz + wy) \\
      s(xy + wz) &  1 - s(xx + zz)  &      s(yz - wx) \\
      s(xz - wy) &      s(yz + wx)  &  1 - s(xx + yy)
\end{bmatrix}</math>
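
As an illustration, the conversion above can be sketched in a few lines of Python. The function name <code>quat_to_matrix</code> and the nested-list matrix layout are choices made for this example, not part of the cited sources.

```python
def quat_to_matrix(w, x, y, z):
    """Rotation matrix (column-vector convention) from any quaternion w + xi + yj + zk."""
    n = w * w + x * x + y * y + z * z
    s = 0.0 if n == 0 else 2.0 / n  # a zero quaternion falls back to the identity
    return [
        [1 - s * (y * y + z * z), s * (x * y - w * z), s * (x * z + w * y)],
        [s * (x * y + w * z), 1 - s * (x * x + z * z), s * (y * z - w * x)],
        [s * (x * z - w * y), s * (y * z + w * x), 1 - s * (x * x + y * y)],
    ]
```

For instance, the non-unit quaternion {{math|1 + '''k'''}} yields the same matrix as its normalized form, a 90° rotation about the {{mvar|z}} axis.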
 
Freed from the demand for a unit quaternion, we find that nonzero quaternions act as [[homogeneous coordinates]] for {{nowrap|3 × 3}} rotation matrices. The Cayley transform, discussed earlier, is obtained by scaling the quaternion so that its {{mvar|w}} component is 1. For a 180° rotation around any axis, {{mvar|w}} will be zero, which explains the Cayley limitation.

The sum of the entries along the main diagonal (the [[trace (linear algebra)|trace]]), plus one, equals {{math|4 − 4(''x''<sup>2</sup> + ''y''<sup>2</sup> + ''z''<sup>2</sup>)}}, which is {{math|4''w''<sup>2</sup>}}. Thus we can write the trace itself as {{math|2''w''<sup>2</sup> + 2''w''<sup>2</sup> − 1}}; and from the previous version of the matrix we see that the diagonal entries themselves have the same form: {{math|2''x''<sup>2</sup> + 2''w''<sup>2</sup> − 1}}, {{math|2''y''<sup>2</sup> + 2''w''<sup>2</sup> − 1}}, and {{math|2''z''<sup>2</sup> + 2''w''<sup>2</sup> − 1}}. So we can easily compare the magnitudes of all four quaternion components using the matrix diagonal. We can, in fact, obtain all four magnitudes using sums and square roots, and choose consistent signs using the skew-symmetric part of the off-diagonal entries:
:<math>\begin{align}
  t &= \operatorname{tr} Q = Q_{xx} + Q_{yy} + Q_{zz} \quad (\text{the trace of }Q) \\
  r &= \sqrt{1 + t} \\
  w &= \tfrac{1}{2} r \\
  x &= \operatorname{sgn}\left(Q_{zy} - Q_{yz}\right)\left|\tfrac12 \sqrt{1 + Q_{xx} - Q_{yy} - Q_{zz}}\right|  \\
  y &= \operatorname{sgn}\left(Q_{xz} - Q_{zx}\right)\left|\tfrac12 \sqrt{1 - Q_{xx} + Q_{yy} - Q_{zz}}\right| \\
  z &= \operatorname{sgn}\left(Q_{yx} - Q_{xy}\right)\left|\tfrac12 \sqrt{1 - Q_{xx} - Q_{yy} + Q_{zz}}\right|
\end{align}</math>

Alternatively, use a single square root and division:
:<math>\begin{align}
  t &= \operatorname{tr} Q = Q_{xx} + Q_{yy} + Q_{zz} \\
  r &= \sqrt{1 + t} \\
  s &= \tfrac{1}{2r} \\
  w &= \tfrac{1}{2} r \\
  x &= \left(Q_{zy} - Q_{yz}\right)s \\
  y &= \left(Q_{xz} - Q_{zx}\right)s \\
  z &= \left(Q_{yx} - Q_{xy}\right)s
\end{align}</math>

This is numerically stable so long as the trace, {{mvar|t}}, is not negative; otherwise, we risk dividing by (nearly) zero. In that case, suppose {{mvar|Q<sub>xx</sub>}} is the largest diagonal entry, so {{mvar|x}} will have the largest magnitude (the other cases are derived by cyclic permutation); then the following is safe.
:<math>\begin{align}
  r &= \sqrt{1 + Q_{xx} - Q_{yy} - Q_{zz}} \\
  s &= \tfrac{1}{2r} \\
  w &= \left(Q_{zy} - Q_{yz}\right)s \\
  x &= \tfrac12 r \\
  y &= \left(Q_{xy} + Q_{yx}\right)s \\
  z &= \left(Q_{zx} + Q_{xz}\right)s
\end{align}</math>
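
The two branches can be combined into one routine. The following Python sketch assumes {{mvar|Q}} is given as a nested list indexed <code>Q[row][column]</code> and is already a rotation matrix; the function name and the index bookkeeping for the cyclic permutations are illustrative choices.

```python
import math

def matrix_to_quat(Q):
    """Return (w, x, y, z) for a 3x3 rotation matrix Q given as nested lists."""
    t = Q[0][0] + Q[1][1] + Q[2][2]
    if t >= 0:
        # Non-negative trace: w is large enough to divide by safely.
        r = math.sqrt(1 + t)
        s = 0.5 / r
        return (0.5 * r,
                (Q[2][1] - Q[1][2]) * s,
                (Q[0][2] - Q[2][0]) * s,
                (Q[1][0] - Q[0][1]) * s)
    # Negative trace: pivot on the largest diagonal entry, index i,
    # where (i, j, k) is a cyclic permutation of (0, 1, 2).
    i = max(range(3), key=lambda a: Q[a][a])
    j, k = (i + 1) % 3, (i + 2) % 3
    r = math.sqrt(1 + Q[i][i] - Q[j][j] - Q[k][k])
    s = 0.5 / r
    q = [0.0] * 4  # stored as [w, x, y, z]
    q[0] = (Q[k][j] - Q[j][k]) * s
    q[1 + i] = 0.5 * r
    q[1 + j] = (Q[i][j] + Q[j][i]) * s
    q[1 + k] = (Q[k][i] + Q[i][k]) * s
    return tuple(q)
```

The returned quaternion is one of the two representatives {{math|±'''q'''}}; in the negative-trace branch the pivot component is taken as positive.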

If the matrix contains significant error, such as accumulated numerical error, we may construct a symmetric {{nowrap|4 × 4}} matrix,
:<math> K = \frac13 \begin{bmatrix}
  Q_{xx}-Q_{yy}-Q_{zz}    &        Q_{yx}+Q_{xy}        &        Q_{zx}+Q_{xz}        &        Q_{zy}-Q_{yz}        \\
         Q_{yx}+Q_{xy}    & Q_{yy}-Q_{xx}-Q_{zz}        &        Q_{zy}+Q_{yz}        &        Q_{xz}-Q_{zx}        \\
         Q_{zx}+Q_{xz}    &        Q_{zy}+Q_{yz}        & Q_{zz}-Q_{xx}-Q_{yy}        &        Q_{yx}-Q_{xy}        \\
         Q_{zy}-Q_{yz}    &        Q_{xz}-Q_{zx}        &        Q_{yx}-Q_{xy}        & Q_{xx}+Q_{yy}+Q_{zz}  
 \end{bmatrix} ,</math>
and find the [[eigenvector]], {{math|(''x'', ''y'', ''z'', ''w'')}}, of its largest magnitude eigenvalue. (If {{mvar|Q}} is truly a rotation matrix, that value will be 1.) The quaternion so obtained will correspond to the rotation matrix closest to the given matrix {{Harv|Bar-Itzhack|2000}}. (Note: the formulation in the cited article is post-multiplied and works with row vectors.)
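
A minimal sketch of this procedure follows, using plain power iteration rather than a library eigensolver. It is not the algorithm of the cited article: the shift by {{math|1/3}}, the choice of starting vector, and the function name are assumptions made for this example, and they rely on the input being reasonably close to a rotation matrix.

```python
import math

def nearest_quat(Q, iterations=50):
    """Approximate (x, y, z, w) as the dominant eigenvector of K built from a 3x3 matrix Q."""
    (xx, xy, xz), (yx, yy, yz), (zx, zy, zz) = Q
    K = [
        [xx - yy - zz, yx + xy, zx + xz, zy - yz],
        [yx + xy, yy - xx - zz, zy + yz, xz - zx],
        [zx + xz, zy + yz, zz - xx - yy, yx - xy],
        [zy - yz, xz - zx, yx - xy, xx + yy + zz],
    ]
    # A = K/3 + I/3. For an exact rotation this equals (4/3) q q^T, so the
    # unwanted eigenvalues sit at zero and power iteration converges quickly.
    A = [[K[i][j] / 3 + (1 / 3 if i == j else 0) for j in range(4)] for i in range(4)]
    # Start from the column with the largest diagonal entry; it is already
    # parallel to q when Q is an exact rotation.
    c = max(range(4), key=lambda i: A[i][i])
    v = [A[i][c] for i in range(4)]
    for _ in range(iterations):
        v = [sum(A[i][j] * v[j] for j in range(4)) for i in range(4)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    return tuple(v)
```

In practice a symmetric eigenvalue routine from a numerical library would normally replace the hand-written iteration.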

=== Polar decomposition ===
If the {{math|''n'' × ''n''}} matrix {{mvar|M}} is nonsingular, its columns are linearly independent vectors; thus the [[Gram–Schmidt process]] can adjust them to be an orthonormal basis. Stated in terms of [[numerical linear algebra]], we convert {{mvar|M}} to an orthogonal matrix, {{mvar|Q}}, using [[QR decomposition]]. However, we often prefer a {{mvar|Q}} closest to {{mvar|M}}, which this method does not accomplish. For that, the tool we want is the [[polar decomposition]] ({{Harvnb|Fan|Hoffman|1955}}; {{Harvnb|Higham|1989}}).

To measure closeness, we may use any [[matrix norm]] invariant under orthogonal transformations. A convenient choice is the [[Frobenius norm]], {{math|{{norm|''Q'' − ''M''}}<sub>F</sub>}}, squared, which is the sum of the squares of the element differences. Writing this in terms of the [[trace (linear algebra)|trace]], {{math|Tr}}, our goal is,
: Find {{mvar|Q}} minimizing {{math|Tr( (''Q'' − ''M'')<sup>T</sup>(''Q'' − ''M'') )}}, subject to {{math|''Q''<sup>T</sup>''Q'' {{=}} ''I''}}.

Though written in matrix terms, the [[objective function]] is just a quadratic polynomial. We can minimize it in the usual way, by finding where its derivative is zero. For a {{nowrap|3 × 3}} matrix, the orthogonality constraint implies six scalar equalities that the entries of {{mvar|Q}} must satisfy. To incorporate the constraint(s), we may employ a standard technique, [[Lagrange multipliers]], assembled as a symmetric matrix, {{mvar|Y}}. Thus our method is:
: Differentiate {{math|Tr( (''Q'' − ''M'')<sup>T</sup>(''Q'' − ''M'') + (''Q''<sup>T</sup>''Q'' − ''I'')''Y'' )}} with respect to (the entries of) {{mvar|Q}}, and equate to zero.

<div style="float:right; font-size:90%; border:1px solid black; padding:1em;">
Consider a {{nowrap|2 × 2}} example. Including constraints, we seek to minimize
:<math>\begin{align}
 &\left(Q_{xx} - M_{xx}\right)^2 + \left(Q_{xy} - M_{xy}\right)^2 + \left(Q_{yx} - M_{yx}\right)^2 + \left(Q_{yy} - M_{yy}\right)^2 \\
 &\quad {}+ \left(Q_{xx}^2 + Q_{yx}^2 - 1\right)Y_{xx} + \left(Q_{xy}^2 + Q_{yy}^2 - 1\right)Y_{yy} + 2\left(Q_{xx} Q_{xy} + Q_{yx} Q_{yy}\right)Y_{xy} .
\end{align}</math>

Taking the derivative with respect to {{mvar|Q<sub>xx</sub>}}, {{mvar|Q<sub>xy</sub>}}, {{mvar|Q<sub>yx</sub>}}, {{mvar|Q<sub>yy</sub>}} in turn, we assemble a matrix.
:<math>2\begin{bmatrix}
  Q_{xx} - M_{xx} + Q_{xx} Y_{xx} + Q_{xy} Y_{xy} & Q_{xy} - M_{xy} + Q_{xx} Y_{xy} + Q_{xy} Y_{yy} \\
  Q_{yx} - M_{yx} + Q_{yx} Y_{xx} + Q_{yy} Y_{xy} & Q_{yy} - M_{yy} + Q_{yx} Y_{xy} + Q_{yy} Y_{yy}
\end{bmatrix}</math>
</div>
In general, we obtain the equation
:<math> 0 = 2(Q - M) + 2QY , </math>
so that
:<math> M = Q(I + Y) = QS , </math>
where {{mvar|Q}} is orthogonal and {{mvar|S}} is symmetric. To ensure a minimum, the {{mvar|Y}} matrix (and hence {{mvar|S}}) must be positive definite. Linear algebra calls {{mvar|QS}} the [[polar decomposition]] of {{mvar|M}}, with {{mvar|S}} the positive square root of {{math|''S''<sup>2</sup> {{=}} ''M''<sup>T</sup>''M''}}.
:<math> S^2 = \left(Q^\mathsf{T} M\right)^\mathsf{T} \left(Q^\mathsf{T} M\right) = M^\mathsf{T} Q Q^\mathsf{T} M = M^\mathsf{T} M </math>

When {{mvar|M}} is [[non-singular matrix|non-singular]], the {{mvar|Q}} and {{mvar|S}} factors of the polar decomposition are uniquely determined. However, the determinant of {{mvar|S}} is positive because {{mvar|S}} is positive definite, so {{mvar|Q}} inherits the sign of the determinant of {{mvar|M}}. That is, {{mvar|Q}} is only guaranteed to be orthogonal, not a rotation matrix. This is unavoidable; an {{mvar|M}} with negative determinant has no uniquely defined closest rotation matrix.
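
The orthogonal factor can be computed in several ways. The sketch below uses a Newton iteration, {{math|''Q'' ← (''Q'' + ''Q''<sup>−T</sup>)/2}} started from {{mvar|M}}, which is a different technique from the Lagrange-multiplier derivation above. It is restricted to nonsingular {{nowrap|3 × 3}} matrices given as nested lists, and the function names and fixed iteration count are choices made for this example.

```python
def inverse_transpose(A):
    """Inverse transpose of a 3x3 matrix: the cofactor matrix divided by the determinant."""
    (a, b, c), (d, e, f), (g, h, i) = A
    cof = [
        [e * i - f * h, f * g - d * i, d * h - e * g],
        [c * h - b * i, a * i - c * g, b * g - a * h],
        [b * f - c * e, c * d - a * f, a * e - b * d],
    ]
    det = a * cof[0][0] + b * cof[0][1] + c * cof[0][2]
    return [[entry / det for entry in row] for row in cof]

def polar_orthogonal_factor(M, iterations=30):
    """Orthogonal factor Q of the polar decomposition M = QS, for nonsingular 3x3 M."""
    Q = [row[:] for row in M]
    for _ in range(iterations):
        T = inverse_transpose(Q)
        # Average Q with its inverse transpose; the fixed points are orthogonal matrices.
        Q = [[0.5 * (Q[r][c] + T[r][c]) for c in range(3)] for r in range(3)]
    return Q
```

Consistent with the remark on determinants, an input such as {{math|diag(1, 1, −2)}} produces {{math|diag(1, 1, −1)}}, which is orthogonal but not a rotation.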

=== Axis and angle ===
{{Main|Axis–angle representation}}
To efficiently construct a rotation matrix {{mvar|Q}} from an angle {{mvar|θ}} and a unit axis {{math|'''u'''}}, we can take advantage of symmetry and skew-symmetry within the entries. If {{mvar|x}}, {{mvar|y}}, and {{mvar|z}} are the components of the unit vector representing the axis, and

:<math>\begin{align}
c &= \cos \theta\\
s &= \sin \theta\\
C &= 1-c
\end{align}</math>

then

:<math>Q(\theta) = \begin{bmatrix}
xxC+c  & xyC-zs & xzC+ys\\
yxC+zs & yyC+c  & yzC-xs\\
zxC-ys & zyC+xs & zzC+c
\end{bmatrix}</math>
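
A direct transcription into Python might look as follows. The function name is illustrative, the angle is taken in radians, and the axis is assumed to be normalized already.

```python
import math

def axis_angle_to_matrix(axis, theta):
    """Rotation matrix for angle theta (radians) about a unit axis (x, y, z)."""
    x, y, z = axis  # assumed to have unit length; not checked here
    c, s = math.cos(theta), math.sin(theta)
    C = 1 - c
    return [
        [x * x * C + c, x * y * C - z * s, x * z * C + y * s],
        [y * x * C + z * s, y * y * C + c, y * z * C - x * s],
        [z * x * C - y * s, z * y * C + x * s, z * z * C + c],
    ]
```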

Determining an axis and angle, like determining a quaternion, is only possible up to the sign; that is, {{math|('''u''', ''θ'')}} and {{math|(−'''u''', −''θ'')}} correspond to the same rotation matrix, just like {{math|''q''}} and {{math|−''q''}}. Axis–angle extraction also presents further difficulties. The angle can be restricted to be from 0° to 180°, but angles are formally ambiguous by multiples of 360°. When the angle is zero, the axis is undefined. When the angle is 180°, the matrix becomes symmetric, so the axis can no longer be read from the skew-symmetric part. Near multiples of 180°, care is needed to avoid numerical problems: in extracting the angle, a [[Atan2|two-argument arctangent]] with {{math|[[atan2]](sin ''θ'', cos ''θ'')}} equal to {{mvar|θ}} avoids the insensitivity of arccos; and in computing the axis magnitude in order to force unit magnitude, a brute-force approach can lose accuracy through underflow {{Harv|Moler|Morrison|1983}}.

A partial approach is as follows:

:<math>\begin{align}
 x &= Q_{zy} - Q_{yz}\\
 y &= Q_{xz} - Q_{zx}\\
 z &= Q_{yx} - Q_{xy}\\
 r &= \sqrt{x^2 + y^2 + z^2}\\
 t &= Q_{xx} + Q_{yy} + Q_{zz}\\
 \theta &= \operatorname{atan2}(r,t-1)\end{align}</math>

The {{mvar|x}}-, {{mvar|y}}-, and {{mvar|z}}-components of the axis would then be divided by {{mvar|r}}. A fully robust approach will use a different algorithm when {{mvar|t}}, the [[Trace (linear algebra)|trace]] of the matrix {{mvar|Q}}, is negative, as with quaternion extraction. When {{mvar|r}} is zero because the angle is zero, an axis must be provided from some source other than the matrix.
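
The partial approach can be sketched in Python as below. Returning <code>None</code> for the axis when {{mvar|r}} is zero is a convention chosen for this example; as noted, this sketch does not include the separate treatment a fully robust routine would need.

```python
import math

def matrix_to_axis_angle(Q):
    """Return (axis, theta) from a 3x3 rotation matrix; axis is None when r is zero."""
    x = Q[2][1] - Q[1][2]
    y = Q[0][2] - Q[2][0]
    z = Q[1][0] - Q[0][1]
    r = math.sqrt(x * x + y * y + z * z)  # equals 2 sin(theta)
    t = Q[0][0] + Q[1][1] + Q[2][2]       # trace, equals 1 + 2 cos(theta)
    theta = math.atan2(r, t - 1)
    if r == 0:
        # Angle 0 (axis undefined) or 180 degrees (skew-symmetric part vanishes):
        # the axis cannot be obtained from these three differences.
        return None, theta
    return (x / r, y / r, z / r), theta
```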

=== Euler angles ===
Complexity of conversion escalates with [[Euler angles]] (used here in the broad sense). The first difficulty is to establish which of the twenty-four variations of Cartesian axis order we will use. Suppose the three angles are {{math|''θ''<sub>1</sub>}}, {{math|''θ''<sub>2</sub>}}, {{math|''θ''<sub>3</sub>}}; physics and chemistry may interpret these as
<!--
 Don't even THINK of "fixing" this; it is NOT a typo. "zyz" is correct.
 -->
:<math> Q(\theta_1,\theta_2,\theta_3)=  Q_{\mathbf{z}}(\theta_1) Q_{\mathbf{y}}(\theta_2) Q_{\mathbf{z}}(\theta_3) , </math>
<!--                                               ^ --- That's right, this is MEANT to be a "z", not an "x". -->
while aircraft dynamics may use
:<math> Q(\theta_1,\theta_2,\theta_3)=  Q_{\mathbf{z}}(\theta_3) Q_{\mathbf{y}}(\theta_2) Q_{\mathbf{x}}(\theta_1) . </math>
One systematic approach begins with choosing the rightmost axis. Among all [[permutation]]s of {{math|(''x'',''y'',''z'')}}, only two place that axis first; one is an even permutation and the other odd. Choosing parity thus establishes the middle axis. That leaves two choices for the leftmost axis, either duplicating the first or not. These three choices give us {{nowrap|3 × 2 × 2 {{=}} 12}} variations; we double that to 24 by choosing static or rotating axes.

This is enough to construct a matrix from angles, but triples differing in many ways can give the same rotation matrix. For example, suppose we use the {{math|'''zyz'''}} convention above; then we have the following equivalent pairs:
:{| style="text-align:right"
| (90°,||45°,||−105°) || ≡ || (−270°,||−315°,||255°) || ''multiples of 360°''
|-
| (72°,||0°,||0°) || ≡ || (40°,||0°,||32°) || ''singular alignment''
|-
| (45°,||60°,||−30°) || ≡ || (−135°,||−60°,||150°) || ''bistable flip''
|}
Angles for any order can be found using a concise common routine ({{Harvnb|Herter|Lott|1993}}; {{Harvnb|Shoemake|1994}}).
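
The equivalences in the table can be checked numerically. The sketch below builds the {{math|'''zyz'''}} product from elementary rotations in the column-vector convention; the helper names and the use of degrees are choices made for this example.

```python
import math

def rot_z(deg):
    """Elementary rotation about the z axis (column-vector convention)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_y(deg):
    """Elementary rotation about the y axis (column-vector convention)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def matmul(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def zyz(t1, t2, t3):
    """Q_z(t1) Q_y(t2) Q_z(t3), with all angles in degrees."""
    return matmul(rot_z(t1), matmul(rot_y(t2), rot_z(t3)))
```

Evaluating <code>zyz(45, 60, -30)</code> and <code>zyz(-135, -60, 150)</code>, for example, gives matrices that agree to within rounding error.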

The problem of singular alignment, the mathematical analog of physical [[gimbal lock]], occurs when the middle rotation aligns the axes of the first and last rotations. It afflicts every axis order at either even or odd multiples of 90°. These singularities are not characteristic of the rotation matrix as such, and only occur with the usage of Euler angles.

The singularities are avoided when considering and manipulating the rotation matrix as orthonormal row vectors (in 3D applications often named the right-vector, up-vector and out-vector) instead of as angles. The singularities are also avoided when working with quaternions.

=== Vector to vector formulation ===
In some instances it is interesting to describe a rotation by specifying how a vector is mapped into another through the shortest path (smallest angle). In <math>\mathbb{R}^3</math> this completely describes the associated rotation matrix. In general, given {{math|''x'', ''y'' ∈ <math>\mathbb{S}</math><sup>''n''</sup>}} with {{math|''x'' ≠ −''y''}}, the matrix
:<math>R:=I+y x^\mathsf{T}-x y^\mathsf{T}+\frac{1}{1+\langle x,y\rangle}\left(yx^\mathsf{T}-xy^\mathsf{T}\right)^2</math>
belongs to {{math|SO(''n'' + 1)}} and maps {{mvar|x}} to {{mvar|y}}.<ref>{{cite journal |last1=Cid |first1=Jose Ángel |last2=Tojo |first2=F. Adrián F. |title=A Lipschitz condition along a transversal foliation implies local uniqueness for ODEs |journal=Electronic Journal of Qualitative Theory of Differential Equations |year=2018 |volume=13 |issue=13 |pages=1–14 |doi=10.14232/ejqtde.2018.1.13 |arxiv=1801.01724 |url=http://www.math.u-szeged.hu/ejqtde/periodica.html?periodica=1&paramtipus_ertek=publication&param_ertek=6497|doi-access=free }}</ref>
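
A short Python sketch of this formula, valid in any dimension, is given below. The function name is illustrative; the inputs are assumed to be unit vectors of equal length, and antipodal inputs are excluded because {{math|1 + ⟨''x'', ''y''⟩}} would then be zero.

```python
def rotation_between(x, y):
    """Matrix R with R x = y, for unit vectors x and y of equal length, x != -y."""
    n = len(x)  # ambient dimension (n + 1 in the notation of the text)
    dot = sum(a * b for a, b in zip(x, y))
    # A = y x^T - x y^T is skew-symmetric and acts only in the plane of x and y.
    A = [[y[i] * x[j] - x[i] * y[j] for j in range(n)] for i in range(n)]
    A2 = [[sum(A[i][k] * A[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[(1.0 if i == j else 0.0) + A[i][j] + A2[i][j] / (1 + dot)
             for j in range(n)] for i in range(n)]
```

Vectors orthogonal to both {{mvar|x}} and {{mvar|y}} are left fixed, since {{math|''yx''<sup>T</sup> − ''xy''<sup>T</sup>}} annihilates them.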

=== Voigt notation ===
In [[materials science]], the fourth-order [[stiffness]] and compliance [[tensors]] are often simplified to a two-dimensional matrix using [[Voigt notation]]. When applying a rotational transform through angle <math> \theta </math> in this notation, the rotation matrix is given by<ref>Clyne, T. W., & Hull, D. (2019). Tensor Analysis of Anisotropic Materials and the Elastic Deformation of Laminae. In ''An Introduction to Composite Materials'' (pp. 43–66). Cambridge: Cambridge University Press.</ref>

:<math> T = \begin{bmatrix}
    \cos^2\theta & \sin^2\theta & 2\sin\theta\cos\theta \\
    \sin^2\theta & \cos^2\theta & -2\sin\theta\cos\theta \\
    -\sin\theta\cos\theta & \sin\theta\cos\theta & \cos^2\theta - \sin^2\theta
  \end{bmatrix}
.</math>

This is particularly useful in [[composite laminate]] design, where plies are often rotated by a certain angle to bring the properties of the laminate closer to [[isotropic]].