
== Perspective projection ==
{{See also|Perspective (graphical)|Transformation matrix#Perspective projection|l2=Transformation matrix|Camera matrix}}

[[File:Distance point.jpg|right|thumb|300px|Perspective of a geometric solid using two vanishing points. In this case, the map of the solid (orthogonal projection) is drawn below the perspective, as if bending the ground plane.]]
[[File:Axonometric scheme.jpg|right|thumb|300px|Axonometric projection of a scheme displaying the relevant elements of a vertical [[picture plane]] perspective. The standing point (P.S.) is located on the ground plane ''π'', and the point of view (P.V.) is right above it. P.P. is its projection on the picture plane ''α''. L.O. and L.T. are the horizon and the ground lines (''linea d'orizzonte'' and ''linea di terra''). The bold lines '''s''' and '''q''' lie on ''π'', and intercept ''α'' at ''Ts'' and ''Tq'' respectively. The parallel lines through P.V. (in red) intercept L.O. in the vanishing points ''Fs'' and ''Fq'': thus one can draw the projections '''s′''' and '''q′''', and hence also their intersection '''R′''' on '''R'''.]]

Perspective projection or perspective transformation is a projection where three-dimensional objects are projected on a ''picture plane''. This has the effect that distant objects appear smaller than nearer objects.

It also means that lines which are parallel in nature (that is, meet at the [[point at infinity]]) appear to intersect in the projected image. For example, if railway tracks are pictured with perspective projection, they appear to converge towards a single point, called the [[vanishing point]]. Photographic lenses and the human eye work in the same way, so perspective projection looks the most realistic.<ref>D. Hearn & M. Baker (1997). ''Computer Graphics, C Version''. Englewood Cliffs: Prentice Hall, chapter 9</ref> Perspective projection is usually categorized into ''one-point'', ''two-point'' and ''three-point perspective'', depending on the orientation of the projection plane towards the axes of the depicted object.<ref>James Foley (1997). ''Computer Graphics''. Boston: Addison-Wesley. {{ISBN|0-201-84840-6}}, chapter 6</ref>

Graphical projection methods rely on the duality between lines and points, whereby two straight lines determine a point while two points determine a straight line. The orthogonal projection of the eye point onto the picture plane is called the ''principal vanishing point'' (P.P. in the scheme on the right, from the Italian term ''punto principale'', coined during the Renaissance).<ref>{{citation |author=Kirsti Andersen |author-link= Kirsti Andersen |title=The geometry of an art |isbn=9780387259611 |year=2007 |publisher=Springer |title-link=The Geometry of an Art|page=xxix}}</ref>

Two relevant points of a line are:
*its intersection with the picture plane, and
*its vanishing point, found at the intersection between the parallel line from the eye point and the picture plane.

The principal vanishing point is the vanishing point of all horizontal lines perpendicular to the picture plane. The vanishing points of all horizontal lines lie on the [[horizon]] line. If, as is often the case, the picture plane is vertical, all vertical lines are drawn vertically, and have no finite vanishing point on the picture plane.

Various graphical methods can be easily envisaged for projecting geometrical scenes. For example, lines traced from the eye point at 45° to the picture plane intersect the latter along a circle whose radius is the distance of the eye point from the plane; tracing that circle therefore aids the construction of all the vanishing points of 45° lines. In particular, the intersection of that circle with the horizon line consists of two ''distance points''. They are useful for drawing chessboard floors which, in turn, serve for locating the base of objects on the scene.

In the perspective of a geometric solid on the right, after choosing the principal vanishing point, which determines the horizon line, the 45° vanishing point on the left side of the drawing completes the characterization of the (equally distant) point of view. Two lines are drawn from the orthogonal projection of each vertex, one at 45° and one at 90° to the picture plane. After intersecting the ground line, those lines go toward the distance point (for 45°) or the principal point (for 90°). Their new intersection locates the projection of the map. Natural heights are measured above the ground line and then projected in the same way until they meet the vertical from the map.
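
The vanishing-point construction described above can be illustrated numerically. The following is a minimal sketch, not a method taken from the cited sources: it assumes the eye point is at the origin and the picture plane is the plane ''z'' = <code>eye_distance</code>, so that the principal vanishing point is (0, 0). The function name <code>vanishing_point</code> and the coordinate setup are illustrative assumptions.

```python
import math

def vanishing_point(direction, eye_distance):
    """Vanishing point of all lines sharing the given 3D direction.

    Assumed setup (for illustration only): eye point at the origin,
    picture plane z = eye_distance, principal vanishing point at (0, 0).
    The vanishing point is where the parallel line through the eye point
    meets the picture plane.
    """
    dx, dy, dz = direction
    if dz == 0:
        # Lines parallel to the picture plane have no finite vanishing point.
        return None
    t = eye_distance / dz
    return (dx * t, dy * t)

d = 3.0
# Lines perpendicular to the picture plane vanish at the principal point.
principal = vanishing_point((0.0, 0.0, 1.0), d)    # (0.0, 0.0)
# Horizontal lines at 45 degrees vanish at the two distance points.
right = vanishing_point((1.0, 0.0, 1.0), d)        # (3.0, 0.0)
left = vanishing_point((-1.0, 0.0, 1.0), d)        # (-3.0, 0.0)
# Any other 45-degree direction lands on the circle of radius d.
phi = 0.7
other = vanishing_point((math.cos(phi), math.sin(phi), 1.0), d)
radius = math.hypot(*other)                        # approximately 3.0
```

Under these assumptions the two distance points lie on the horizon line at a distance from the principal point equal to the distance of the eye from the picture plane, matching the circle construction in the text.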

While orthographic projection ignores perspective to allow accurate measurements, perspective projection shows distant objects as smaller to provide additional realism.

===Mathematical formula===

The perspective projection requires a more involved definition as compared to orthographic projections. A conceptual aid to understanding the mechanics of this projection is to imagine the 2D projection as though the object(s) are being viewed through a camera viewfinder. The camera's position, orientation, and [[field of view]] control the behavior of the projection transformation. The following variables are defined to describe this transformation:
* <math>\mathbf{a}_{x,y,z}</math> – the 3D position of a point ''A'' that is to be projected
* <math>\mathbf{c}_{x,y,z}</math> – the 3D position of a point ''C'' representing the camera
* <math>\mathbf{\theta}_{x,y,z}</math> – The [[Orientation (geometry)|orientation]] of the camera (represented by [[Tait-Bryan angles|Tait–Bryan angles]])
* <math>\mathbf{e}_{x,y,z}</math> – the [[Image plane|display surface]]'s position relative to aforementioned <math>\mathbf{c}</math><ref>{{Cite journal
|author=Ingrid Carlbom, Joseph Paciorek
|title=Planar Geometric Projections and Viewing Transformations
|journal=[[ACM Computing Surveys]]
|volume=10
|issue=4
|pages=465–502
|year=1978
|doi=10.1145/356744.356750
|url=http://www.cs.uns.edu.ar/cg/clasespdf/p465carlbom.pdf
|citeseerx=10.1.1.532.4774
|s2cid=708008
}}</ref> 
Most conventions use positive ''z'' values for the display surface (the plane being in front of the pinhole <math>\mathbf{c}</math>). Negative ''z'' values are physically more correct, but the image is then inverted both horizontally and vertically.
These variables yield:
* <math>\mathbf{b}_{x,y}</math> – the 2D projection of <math>\mathbf{a}.</math>

When <math>\mathbf{c}_{x,y,z}=\langle 0,0,0\rangle,</math> and <math>\mathbf{\theta}_{x,y,z} = \langle 0,0,0\rangle,</math> the 3D vector <math>\langle 1,2,0 \rangle</math> is projected to the 2D vector <math>\langle 1,2 \rangle</math>.

Otherwise, to compute <math>\mathbf{b}_{x,y}</math> we first define a vector <math>\mathbf{d}_{x,y,z}</math> as the position of point ''A'' with respect to a [[coordinate system]] defined by the camera, with origin in ''C'' and rotated by <math>\mathbf{\theta}</math> with respect to the initial coordinate system.  This is achieved by [[Matrix addition|subtracting]] <math>\mathbf{c}</math> from <math>\mathbf{a}</math> and then applying a rotation by <math>-\mathbf{\theta}</math> to the result.  This transformation is often called a '''{{visible anchor|camera transform}}''', and can be expressed as follows, expressing the rotation in terms of rotations about the ''x,'' ''y,'' and ''z'' axes (these calculations assume that the axes are ordered as a [[Cartesian coordinates#Orientation and handedness|left-handed]] system of axes):
<!--Orthogonal transformation (pg 931) and Matrix and Adaptation for n-dimensional arbitrary rotation (pg 942):--><ref>{{cite book
  | last = Riley | first = K F
  | title = Mathematical Methods for Physics and Engineering
  | url = https://archive.org/details/mathematicalmeth00rile | url-access = registration | year = 2006
  | publisher = [[Cambridge University Press]]
  | pages = [https://archive.org/details/mathematicalmeth00rile/page/931 931], 942
  | isbn = 978-0-521-67971-8 }}
</ref>
<!--Related form, using rotation around intermediate axes--><ref>{{cite book
  | last = Goldstein| first = Herbert
  | title = Classical Mechanics
  | edition= 2nd
  | year = 1980
  | pages = 146–148
  | isbn = 978-0-201-02918-5
  | publisher = Addison-Wesley Pub. Co.
  | location = Reading, Mass. }}
</ref>
:<math>
\begin{bmatrix}
   \mathbf{d}_x \\
   \mathbf{d}_y \\
   \mathbf{d}_z
\end{bmatrix}=\begin{bmatrix}
   1 & 0 & 0  \\
   0 &  \cos ( \mathbf{\theta}_x ) & \sin ( \mathbf{ \theta}_x ) \\
   0 & -\sin ( \mathbf{ \theta}_x ) & \cos ( \mathbf{ \theta}_x )
\end{bmatrix}\begin{bmatrix}
   \cos ( \mathbf{\theta}_y ) & 0 & - \sin ( \mathbf{ \theta}_y ) \\
   0 & 1 & 0  \\
   \sin ( \mathbf{\theta}_y ) & 0 & \cos ( \mathbf{\theta}_y )
\end{bmatrix}\begin{bmatrix}
    \cos ( \mathbf{\theta}_z ) & \sin ( \mathbf{\theta}_z ) & 0  \\
   -\sin ( \mathbf{\theta}_z ) &   \cos ( \mathbf{\theta}_z ) & 0  \\
   0 & 0 & 1
\end{bmatrix}\left( {\begin{bmatrix}
   \mathbf{a}_x  \\
   \mathbf{a}_y  \\
   \mathbf{a}_z  \\
\end{bmatrix} - \begin{bmatrix}
   \mathbf{c}_x  \\
   \mathbf{c}_y  \\
   \mathbf{c}_z  \\
\end{bmatrix}} \right)
</math>
This representation corresponds to rotating by three [[Euler angles]] (more properly, [[Tait-Bryan angles|Tait–Bryan angles]]), using the ''xyz'' convention, which can be interpreted either as "rotate about the ''extrinsic'' axes (axes of the ''scene'') in the order ''z'', ''y'', ''x'' (reading right-to-left)" or "rotate about the ''intrinsic'' axes (axes of the ''camera'') in the order ''x, y, z'' (reading left-to-right)". If the camera is not rotated (<math>\mathbf{\theta}_{x,y,z} = \langle 0,0,0\rangle</math>), then the matrices drop out (as identities), and this reduces to simply a shift: <math>\mathbf{d} = \mathbf{a} - \mathbf{c}.</math>
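
The camera transform above can be sketched in code as follows. This is an illustrative example only: the function name <code>camera_transform</code> and the use of plain tuples are assumptions, and the three rotations are applied right-to-left (about ''z'', then ''y'', then ''x'') exactly as the matrices are written.

```python
import math

def camera_transform(a, c, theta):
    """Position of point a in camera coordinates (vector d in the text).

    a, c  -- 3D positions of the point and of the camera
    theta -- camera orientation (theta_x, theta_y, theta_z) in radians
    """
    # Shift so that the camera sits at the origin.
    x, y, z = a[0] - c[0], a[1] - c[1], a[2] - c[2]
    sx, cx = math.sin(theta[0]), math.cos(theta[0])
    sy, cy = math.sin(theta[1]), math.cos(theta[1])
    sz, cz = math.sin(theta[2]), math.cos(theta[2])

    # Rightmost matrix: rotation about the z axis.
    x1, y1, z1 = cz * x + sz * y, -sz * x + cz * y, z
    # Middle matrix: rotation about the y axis.
    x2, y2, z2 = cy * x1 - sy * z1, y1, sy * x1 + cy * z1
    # Leftmost matrix: rotation about the x axis.
    return (x2, cx * y2 + sx * z2, -sx * y2 + cx * z2)

# With an unrotated camera the transform is a pure shift, d = a - c.
d = camera_transform((1, 2, 3), (1, 1, 1), (0, 0, 0))   # (0.0, 1.0, 2.0)
```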

Alternatively, without using matrices (let us replace <math>a_x - c_x</math> with <math>\mathbf{x}</math> and so on, and abbreviate <math>\cos\left(\theta_\alpha\right)</math> to <math>cos_\alpha</math> and <math>\sin\left(\theta_\alpha\right)</math> to <math>sin_\alpha</math>):
:<math>
\begin{align}
	\mathbf{d}_x & = cos_y (sin_z \mathbf{y}+cos_z \mathbf{x})-sin_y \mathbf{z} \\
	\mathbf{d}_y & = sin_x (cos_y \mathbf{z}+sin_y (sin_z \mathbf{y}+cos_z \mathbf{x}))+cos_x (cos_z \mathbf{y}-sin_z \mathbf{x}) \\
	\mathbf{d}_z & = cos_x (cos_y \mathbf{z}+sin_y (sin_z \mathbf{y}+cos_z \mathbf{x}))-sin_x (cos_z \mathbf{y}-sin_z \mathbf{x})
\end{align}
</math>
This transformed point can then be projected onto the 2D plane using the formula (here, ''x''/''y'' is used as the projection plane; literature also may use ''x''/''z''):<ref>
{{Cite book
  | last1 = Sonka | first1 = M
  | last2 = Hlavac | first2 = V
  | last3 = Boyle | first3 = R
  | title = Image Processing, Analysis & Machine Vision | edition= 2nd
  | publisher = Chapman and Hall
  | year = 1995
  | page = 14
  | isbn = 978-0-412-45570-4
  }}
</ref>
:<math>
\begin{align}
 \mathbf{b}_x &= \frac{\mathbf{e}_z}{\mathbf{d}_z} \mathbf{d}_x + \mathbf{e}_x, \\[5pt]
 \mathbf{b}_y &= \frac{\mathbf{e}_z}{\mathbf{d}_z} \mathbf{d}_y + \mathbf{e}_y.
\end{align}
</math>
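
As a minimal sketch of this formula (the function name <code>project</code> and the error handling are illustrative assumptions, not part of the cited source):

```python
def project(d, e):
    """Project the camera-space point d onto the 2D plane.

    d -- (d_x, d_y, d_z), the point in camera coordinates
    e -- (e_x, e_y, e_z), the display surface position relative to the camera
    """
    dx, dy, dz = d
    ex, ey, ez = e
    if dz == 0:
        # The formula divides by d_z, so such points cannot be projected.
        raise ValueError("d_z must be non-zero")
    scale = ez / dz
    return (scale * dx + ex, scale * dy + ey)

near = project((2.0, 4.0, 4.0), (0.0, 0.0, 2.0))   # (1.0, 2.0)
far = project((2.0, 4.0, 8.0), (0.0, 0.0, 2.0))    # (0.5, 1.0)
```

Doubling the depth of the point halves its projected coordinates, which is the foreshortening effect described at the start of this section.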

Or, in matrix form using [[homogeneous coordinates]], the system
:<math>
\begin{bmatrix}
   \mathbf{f}_x \\
   \mathbf{f}_y \\
   \mathbf{f}_w
\end{bmatrix}=\begin{bmatrix}
   1 & 0 & \frac{\mathbf{e}_x}{\mathbf{e}_z} \\
   0 & 1 & \frac{\mathbf{e}_y}{\mathbf{e}_z} \\
   0 & 0 & \frac{1}{\mathbf{e}_z}
\end{bmatrix}\begin{bmatrix}
   \mathbf{d}_x  \\
   \mathbf{d}_y  \\
   \mathbf{d}_z
\end{bmatrix}
</math>
in conjunction with an argument using similar triangles, leads to division by the homogeneous coordinate, giving
:<math>
\begin{align}
 \mathbf{b}_x &= \mathbf{f}_x / \mathbf{f}_w \\
 \mathbf{b}_y &= \mathbf{f}_y / \mathbf{f}_w
\end{align}
</math>
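
The homogeneous form can be checked numerically against the direct formula. The sketch below is illustrative only; the function name <code>project_homogeneous</code> is an assumption, and the matrix product is written out row by row rather than using a matrix library.

```python
def project_homogeneous(d, e):
    """Projection via homogeneous coordinates (f_x, f_y, f_w)."""
    dx, dy, dz = d
    ex, ey, ez = e
    # Rows of the 3x3 matrix applied to (d_x, d_y, d_z).
    fx = dx + (ex / ez) * dz
    fy = dy + (ey / ez) * dz
    fw = dz / ez
    # Division by the homogeneous coordinate gives the 2D point.
    return (fx / fw, fy / fw)

d = (2.0, 4.0, 4.0)
e = (1.0, -1.0, 2.0)
b = project_homogeneous(d, e)                       # (2.0, 1.0)
# Direct formula for comparison: b = (e_z / d_z) * d + e, componentwise.
direct = (e[2] / d[2] * d[0] + e[0], e[2] / d[2] * d[1] + e[1])
```

For this input both routes give the same 2D point.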

The distance of the viewer from the display surface, <math>\mathbf{e}_z</math>, directly relates to the field of view, where <math>\alpha=2 \cdot \arctan(1/\mathbf{e}_z)</math> is the viewed angle. (Note: this assumes that the points (−1,−1) and (1,1) are mapped to the corners of the viewing surface.)
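
As a small illustration of this relation, under the same assumption that the viewing surface spans −1 to 1 (the function names below are hypothetical):

```python
import math

def field_of_view(e_z):
    """Viewed angle alpha in radians for viewer distance e_z."""
    return 2.0 * math.atan(1.0 / e_z)

def viewer_distance(alpha):
    """Inverse relation: the distance e_z that yields the viewed angle alpha."""
    return 1.0 / math.tan(alpha / 2.0)

# A viewer at unit distance sees a 90 degree field of view.
alpha_deg = math.degrees(field_of_view(1.0))   # approximately 90.0
```

Moving the viewer farther from the display surface narrows the field of view, and moving closer widens it.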

The above equations can also be rewritten as:
:<math>
\begin{align}
 \mathbf{b}_x & = \frac{\mathbf{d}_x \mathbf{s}_x}{\mathbf{d}_z \mathbf{r}_x} \mathbf{r}_z, \\[5pt]
 \mathbf{b}_y & = \frac{\mathbf{d}_y \mathbf{s}_y}{\mathbf{d}_z \mathbf{r}_y} \mathbf{r}_z.
\end{align}
</math>
Here <math>\mathbf{s}_{x,y}</math> is the display size, <math>\mathbf{r}_{x,y}</math> is the recording surface size ([[Charge-coupled device|CCD]] or [[photographic film]]), <math>\mathbf{r}_z</math> is the distance from the recording surface to the [[entrance pupil]] ([[pinhole camera model|camera center]]), and <math>\mathbf{d}_z</math> is the distance from the 3D point being projected to the entrance pupil.

Subsequent clipping and scaling operations may be necessary to map the 2D plane onto any particular display medium.
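
A brief numerical sketch of this form follows. The function name <code>project_to_display</code> and the sample values (a 36 × 24 recording surface at distance 50, shown on a 1920 × 1280 display) are assumptions chosen for illustration; the recording surface size and its distance must share one unit, and the result is in display units.

```python
def project_to_display(d, s, r, r_z):
    """Display coordinates of the camera-space point d.

    d   -- (d_x, d_y, d_z), the point in camera coordinates
    s   -- (s_x, s_y), the display size
    r   -- (r_x, r_y), the recording surface size
    r_z -- distance from the recording surface to the entrance pupil
    """
    dx, dy, dz = d
    bx = (dx * s[0]) / (dz * r[0]) * r_z
    by = (dy * s[1]) / (dz * r[1]) * r_z
    return (bx, by)

# This point images onto the corner of the recording surface
# (18 by 12 from its centre), so it lands half a display size
# from the display centre: approximately (960.0, 640.0).
b = project_to_display((0.18, 0.12, 0.5), (1920.0, 1280.0), (36.0, 24.0), 50.0)
```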

=== Weak perspective projection ===

A "weak" perspective projection uses the same principles as an orthographic projection, but requires the scaling factor to be specified, thus ensuring that closer objects appear bigger in the projection, and vice versa. It can be seen as a hybrid between an orthographic and a perspective projection, and described either as a perspective projection with individual point depths <math>Z_i</math> replaced by an average constant depth <math>Z_\text{ave}</math>,<ref>{{cite web |url = http://www.cse.iitd.ernet.in/~suban/vision/affine/node5.html |title = The Weak-Perspective Camera |author = Subhashis Banerjee |date = 2002-02-18 }}</ref> or simply as an orthographic projection plus a scaling.<ref>{{cite tech report |first = T. D. |last = Alter |title = 3D Pose from 3 Corresponding Points under Weak-Perspective Projection |url = http://dspace.mit.edu/bitstream/handle/1721.1/6611/AIM-1378.pdf |institution = MIT [[AI Lab]] |date=July 1992 }}</ref>

The weak-perspective model thus approximates perspective projection while using a simpler model, similar to the pure (unscaled) orthographic projection.
It is a reasonable approximation when the depth of the object along the line of sight is small compared to the distance from the camera, and the field of view is small. With these conditions, it can be assumed that all points on a 3D object are at the same distance <math>Z_\text{ave}</math> from the camera without significant errors in the projection (compared to the full perspective model).

'''''Equation'''''
:<math>\begin{align}
& P_x = \frac X {Z_\text{ave}} \\[5pt]
& P_y = \frac Y {Z_\text{ave}}
\end{align}</math>
assuming focal length <math display="inline">f = 1</math>.
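
The equation can be sketched as follows. The function name <code>weak_perspective</code> is hypothetical, and taking <math>Z_\text{ave}</math> as the arithmetic mean of the point depths is an assumption made for illustration; the optional parameter <code>f</code> generalizes the formula beyond the <math display="inline">f = 1</math> case stated above.

```python
def weak_perspective(points, f=1.0):
    """Weak-perspective projection of a list of (X, Y, Z) points.

    Every point is divided by the same average depth Z_ave instead of
    its own depth, i.e. an orthographic projection followed by a
    uniform scaling.
    """
    z_ave = sum(p[2] for p in points) / len(points)
    return [(f * x / z_ave, f * y / z_ave) for x, y, _ in points]

# Two points of a shallow object far from the camera (Z_ave = 10).
projected = weak_perspective([(1.0, 2.0, 9.0), (3.0, 4.0, 11.0)])
# approximately [(0.1, 0.2), (0.3, 0.4)]
```

Full perspective projection would instead divide each point by its own depth (9 and 11 here), so the two models differ only slightly when the depth range of the object is small relative to its distance from the camera.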