===Mathematical formula===

The perspective projection requires a more involved definition than orthographic projections do. A useful conceptual aid is to imagine the 2D projection as the view of the object(s) through a camera viewfinder. The camera's position, orientation, and [[field of view]] control the behavior of the projection transformation. The following variables are defined to describe this transformation:
* <math>\mathbf{a}_{x,y,z}</math> – the 3D position of a point ''A'' that is to be projected
* <math>\mathbf{c}_{x,y,z}</math> – the 3D position of a point ''C'' representing the camera
* <math>\mathbf{\theta}_{x,y,z}</math> – the [[Orientation (geometry)|orientation]] of the camera (represented by [[Tait-Bryan angles|Tait–Bryan angles]])
* <math>\mathbf{e}_{x,y,z}</math> – the [[Image plane|display surface]]'s position relative to aforementioned <math>\mathbf{c}</math><ref>{{Cite journal
|author=Ingrid Carlbom, Joseph Paciorek
|title=Planar Geometric Projections and Viewing Transformations
|journal=[[ACM Computing Surveys]]
|volume=10
|issue=4
|pages=465–502
|year=1978
|doi=10.1145/356744.356750
|url=http://www.cs.uns.edu.ar/cg/clasespdf/p465carlbom.pdf
|citeseerx=10.1.1.532.4774
|s2cid=708008
}}</ref> 
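Most conventions use positive ''z'' values for <math>\mathbf{e}_z</math> (the plane lies in front of the pinhole <math>\mathbf{c}</math>). Negative ''z'' values are physically more accurate, since the image plane of a real pinhole camera lies behind the pinhole, but the resulting image is then inverted both horizontally and vertically.
The result of the transformation is: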
* <math>\mathbf{b}_{x,y}</math> – the 2D projection of <math>\mathbf{a}.</math>

When <math>\mathbf{c}_{x,y,z}=\langle 0,0,0\rangle</math> and <math>\mathbf{\theta}_{x,y,z} = \langle 0,0,0\rangle,</math> the camera coordinate system coincides with the scene coordinate system, so ''A'' can be projected directly from its coordinates <math>\mathbf{a}</math>, without the camera transform described below.

Otherwise, to compute <math>\mathbf{b}_{x,y}</math> we first define a vector <math>\mathbf{d}_{x,y,z}</math> as the position of point ''A'' with respect to a [[coordinate system]] defined by the camera, with its origin at ''C'' and rotated by <math>\mathbf{\theta}</math> with respect to the initial coordinate system. This is achieved by [[Matrix addition|subtracting]] <math>\mathbf{c}</math> from <math>\mathbf{a}</math> and then applying a rotation by <math>-\mathbf{\theta}</math> to the result. This transformation is often called a '''{{visible anchor|camera transform}}''' and can be written as a product of rotations about the ''x'', ''y'', and ''z'' axes (these calculations assume a [[Cartesian coordinates#Orientation and handedness|left-handed]] system of axes):
<!--Orthogonal transformation (pg 931) and Matrix and Adaptation for n-dimensional arbitrary rotation (pg 942):--><ref>{{cite book
  | last = Riley | first = K F
  | title = Mathematical Methods for Physics and Engineering
  | url = https://archive.org/details/mathematicalmeth00rile | url-access = registration | year = 2006
  | publisher = [[Cambridge University Press]]
  | pages = [https://archive.org/details/mathematicalmeth00rile/page/931 931], 942
  | isbn = 978-0-521-67971-8 }}
</ref>
<!--Related form, using rotation around intermediate axes--><ref>{{cite book
  | last = Goldstein| first = Herbert
  | title = Classical Mechanics
  | edition= 2nd
  | year = 1980
  | pages = 146–148
  | isbn = 978-0-201-02918-5
  | publisher = Addison-Wesley Pub. Co.
  | location = Reading, Mass. }}
</ref>
:<math>
\begin{bmatrix}
   \mathbf{d}_x \\
   \mathbf{d}_y \\
   \mathbf{d}_z
\end{bmatrix}=\begin{bmatrix}
   1 & 0 & 0  \\
   0 &  \cos ( \mathbf{\theta}_x ) & \sin ( \mathbf{ \theta}_x ) \\
   0 & -\sin ( \mathbf{ \theta}_x ) & \cos ( \mathbf{ \theta}_x )
\end{bmatrix}\begin{bmatrix}
   \cos ( \mathbf{\theta}_y ) & 0 & - \sin ( \mathbf{ \theta}_y ) \\
   0 & 1 & 0  \\
   \sin ( \mathbf{\theta}_y ) & 0 & \cos ( \mathbf{\theta}_y )
\end{bmatrix}\begin{bmatrix}
    \cos ( \mathbf{\theta}_z ) & \sin ( \mathbf{\theta}_z ) & 0  \\
   -\sin ( \mathbf{\theta}_z ) &   \cos ( \mathbf{\theta}_z ) & 0  \\
   0 & 0 & 1
\end{bmatrix}\left( {\begin{bmatrix}
   \mathbf{a}_x  \\
   \mathbf{a}_y  \\
   \mathbf{a}_z  \\
\end{bmatrix} - \begin{bmatrix}
   \mathbf{c}_x  \\
   \mathbf{c}_y  \\
   \mathbf{c}_z  \\
\end{bmatrix}} \right)
</math>
This representation corresponds to rotating by three [[Euler angles]] (more properly, [[Tait-Bryan angles|Tait–Bryan angles]]), using the ''xyz'' convention, which can be interpreted either as "rotate about the ''extrinsic'' axes (axes of the ''scene'') in the order ''z'', ''y'', ''x'' (reading right-to-left)" or "rotate about the ''intrinsic'' axes (axes of the ''camera'') in the order ''x, y, z'' (reading left-to-right)". If the camera is not rotated (<math>\mathbf{\theta}_{x,y,z} = \langle 0,0,0\rangle</math>), then the matrices drop out (as identities), and this reduces to simply a shift: <math>\mathbf{d} = \mathbf{a} - \mathbf{c}.</math>
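
The matrix product above can be sketched in Python as follows. This is a minimal illustration, not taken from the cited sources: the function names <code>mat_vec</code> and <code>camera_transform</code> are hypothetical, angles are assumed to be in radians, and points are assumed to be 3-element sequences.

```python
import math

def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-element vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def camera_transform(a, c, theta):
    """Return d, the position of point a in the camera's coordinate system.

    a, c  -- 3D positions of the point and the camera
    theta -- camera orientation (theta_x, theta_y, theta_z), in radians (assumed)
    """
    tx, ty, tz = theta
    rot_x = [[1, 0, 0],
             [0, math.cos(tx), math.sin(tx)],
             [0, -math.sin(tx), math.cos(tx)]]
    rot_y = [[math.cos(ty), 0, -math.sin(ty)],
             [0, 1, 0],
             [math.sin(ty), 0, math.cos(ty)]]
    rot_z = [[math.cos(tz), math.sin(tz), 0],
             [-math.sin(tz), math.cos(tz), 0],
             [0, 0, 1]]
    # Shift so that the camera sits at the origin.
    shifted = [a[i] - c[i] for i in range(3)]
    # The rightmost matrix acts first: z, then y, then x.
    return mat_vec(rot_x, mat_vec(rot_y, mat_vec(rot_z, shifted)))
```

With all three angles set to zero, each matrix is the identity and the function returns <code>a - c</code>, matching the reduction described above.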

Alternatively, without using matrices (writing <math>\mathbf{x}</math> for <math>\mathbf{a}_x - \mathbf{c}_x</math> and so on, and abbreviating <math>\cos\left(\theta_\alpha\right)</math> to <math>cos_\alpha</math> and <math>\sin\left(\theta_\alpha\right)</math> to <math>sin_\alpha</math>):
:<math>
\begin{align}
	\mathbf{d}_x & = cos_y (sin_z \mathbf{y}+cos_z \mathbf{x})-sin_y \mathbf{z} \\
	\mathbf{d}_y & = sin_x (cos_y \mathbf{z}+sin_y (sin_z \mathbf{y}+cos_z \mathbf{x}))+cos_x (cos_z \mathbf{y}-sin_z \mathbf{x}) \\
	\mathbf{d}_z & = cos_x (cos_y \mathbf{z}+sin_y (sin_z \mathbf{y}+cos_z \mathbf{x}))-sin_x (cos_z \mathbf{y}-sin_z \mathbf{x})
\end{align}
</math>
This transformed point can then be projected onto the 2D plane using the formula (here, ''x''/''y'' is used as the projection plane; literature also may use ''x''/''z''):<ref>
{{Cite book
  | last1 = Sonka | first1 = M
  | last2 = Hlavac | first2 = V
  | last3 = Boyle | first3 = R
  | title = Image Processing, Analysis & Machine Vision | edition= 2nd
  | publisher = Chapman and Hall
  | year = 1995
  | page = 14
  | isbn = 978-0-412-45570-4
  }}
</ref>
:<math>
\begin{align}
 \mathbf{b}_x &= \frac{\mathbf{e}_z}{\mathbf{d}_z} \mathbf{d}_x + \mathbf{e}_x, \\[5pt]
 \mathbf{b}_y &= \frac{\mathbf{e}_z}{\mathbf{d}_z} \mathbf{d}_y + \mathbf{e}_y.
\end{align}
</math>
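
As a rough sketch, the two formulas above translate into a few lines of Python. The function name <code>project</code> and the explicit check for <math>\mathbf{d}_z = 0</math> are assumptions made for this illustration; the formula itself is undefined in that case.

```python
def project(d, e):
    """Project a camera-space point d onto the display surface.

    d -- (d_x, d_y, d_z), the point in camera coordinates
    e -- (e_x, e_y, e_z), the display surface's position relative to the camera
    """
    dx, dy, dz = d
    ex, ey, ez = e
    if dz == 0:
        # The point lies in the plane through the camera parallel to the
        # display surface, so the division by d_z is undefined.
        raise ValueError("d_z must be non-zero")
    scale = ez / dz
    return (scale * dx + ex, scale * dy + ey)
```

For example, <code>project((2, 4, 4), (0, 0, 2))</code> scales the point by <math>\mathbf{e}_z/\mathbf{d}_z = 0.5</math> and returns <code>(1.0, 2.0)</code>.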

Or, in matrix form using [[homogeneous coordinates]], the system
:<math>
\begin{bmatrix}
   \mathbf{f}_x \\
   \mathbf{f}_y \\
   \mathbf{f}_w
\end{bmatrix}=\begin{bmatrix}
   1 & 0 & \frac{\mathbf{e}_x}{\mathbf{e}_z} \\
   0 & 1 & \frac{\mathbf{e}_y}{\mathbf{e}_z} \\
   0 & 0 & \frac{1}{\mathbf{e}_z}
\end{bmatrix}\begin{bmatrix}
   \mathbf{d}_x  \\
   \mathbf{d}_y  \\
   \mathbf{d}_z
\end{bmatrix}
</math>
in conjunction with an argument using similar triangles, leads to division by the homogeneous coordinate, giving
:<math>
\begin{align}
 \mathbf{b}_x &= \mathbf{f}_x / \mathbf{f}_w \\
 \mathbf{b}_y &= \mathbf{f}_y / \mathbf{f}_w
\end{align}
</math>
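
The homogeneous form can be illustrated the same way. In this sketch the name <code>project_homogeneous</code> is hypothetical, and <math>\mathbf{e}_z</math> and <math>\mathbf{d}_z</math> are assumed to be non-zero.

```python
def project_homogeneous(d, e):
    """Project d via the 3x3 homogeneous matrix, then divide by f_w."""
    ex, ey, ez = e
    matrix = [[1, 0, ex / ez],
              [0, 1, ey / ez],
              [0, 0, 1 / ez]]
    # f = matrix * d, giving the homogeneous coordinates (f_x, f_y, f_w).
    fx, fy, fw = (sum(row[j] * d[j] for j in range(3)) for row in matrix)
    # Dividing by the homogeneous coordinate yields the 2D point b.
    return (fx / fw, fy / fw)
```

Since <math>\mathbf{f}_w = \mathbf{d}_z / \mathbf{e}_z</math>, the division reproduces <math>\mathbf{b}_x = \tfrac{\mathbf{e}_z}{\mathbf{d}_z}\mathbf{d}_x + \mathbf{e}_x</math>, and likewise for <math>\mathbf{b}_y</math>.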

The distance of the viewer from the display surface, <math>\mathbf{e}_z</math>, is directly related to the field of view: <math>\alpha=2 \cdot \arctan(1/\mathbf{e}_z)</math> is the viewing angle. This assumes that the points (−1,−1) and (1,1) are mapped to the corners of the viewing surface.
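
A short Python sketch of this relation and its inverse follows. The function names are illustrative only, and angles are assumed to be in radians.

```python
import math

def field_of_view(e_z):
    """Viewing angle alpha (radians) for a viewer at distance e_z."""
    return 2 * math.atan(1 / e_z)

def distance_for_field_of_view(alpha):
    """Inverse relation: the distance e_z that gives viewing angle alpha."""
    return 1 / math.tan(alpha / 2)
```

For instance, <math>\mathbf{e}_z = 1</math> corresponds to a viewing angle of about 90 degrees, and larger distances give narrower angles.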

The above equations can also be rewritten as:
:<math>
\begin{align}
 \mathbf{b}_x & = (\mathbf{d}_x \mathbf{s}_x ) / (\mathbf{d}_z \mathbf{r}_x) \mathbf{r}_z, \\
 \mathbf{b}_y & = (\mathbf{d}_y \mathbf{s}_y ) / (\mathbf{d}_z \mathbf{r}_y) \mathbf{r}_z.
\end{align}
</math>
Here <math>\mathbf{s}_{x,y}</math> is the display size, <math>\mathbf{r}_{x,y}</math> is the recording surface size ([[Charge-coupled device|CCD]] or [[photographic film]]), <math>\mathbf{r}_z</math> is the distance from the recording surface to the [[entrance pupil]] ([[pinhole camera model|camera center]]), and <math>\mathbf{d}_z</math> is the distance from the 3D point being projected to the entrance pupil.
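
The rewritten equations can be sketched as below. The function name <code>project_to_display</code> is hypothetical, and the example values are illustrative assumptions rather than figures from the sources.

```python
def project_to_display(d, s, r, r_z):
    """Map a camera-space point d to display coordinates.

    d   -- (d_x, d_y, d_z), the point in camera coordinates
    s   -- (s_x, s_y), the display size
    r   -- (r_x, r_y), the recording surface size
    r_z -- distance from the recording surface to the entrance pupil
    """
    dx, dy, dz = d
    bx = (dx * s[0]) / (dz * r[0]) * r_z
    by = (dy * s[1]) / (dz * r[1]) * r_z
    return (bx, by)
```

As a usage example under assumed units, a 1920 × 1080 display, a 36 × 24 recording surface, and <math>\mathbf{r}_z = 50</math> place the point (1, 2, 10) at a display ''y'' coordinate of 450.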

Subsequent clipping and scaling operations may be necessary to map the 2D plane onto a particular display medium.