===Mathematical formula===
The perspective projection requires a more involved definition than orthographic projections. A conceptual aid to understanding the mechanics of this projection is to imagine the 2D projection as though the object(s) are being viewed through a camera viewfinder. The camera's position, orientation, and [[field of view]] control the behavior of the projection transformation. The following variables are defined to describe this transformation:
* <math>\mathbf{a}_{x,y,z}</math> – the 3D position of a point ''A'' that is to be projected
* <math>\mathbf{c}_{x,y,z}</math> – the 3D position of a point ''C'' representing the camera
* <math>\mathbf{\theta}_{x,y,z}</math> – the [[Orientation (geometry)|orientation]] of the camera (represented by [[Tait-Bryan angles|Tait–Bryan angles]])
* <math>\mathbf{e}_{x,y,z}</math> – the [[Image plane|display surface]]'s position relative to the aforementioned <math>\mathbf{c}</math><ref>{{Cite journal |author=Ingrid Carlbom, Joseph Paciorek |title=Planar Geometric Projections and Viewing Transformations |journal=[[ACM Computing Surveys]] |volume=10 |issue=4 |pages=465–502 |year=1978 |doi=10.1145/356744.356750 |url=http://www.cs.uns.edu.ar/cg/clasespdf/p465carlbom.pdf |citeseerx=10.1.1.532.4774 |s2cid=708008 }}</ref>

Most conventions use positive ''z'' values for <math>\mathbf{e}_z</math> (the plane lies in front of the pinhole <math>\mathbf{c}</math>). Negative ''z'' values are physically more correct, but the image is then inverted both horizontally and vertically.

The result of the transformation is:
* <math>\mathbf{b}_{x,y}</math> – the 2D projection of <math>\mathbf{a}.</math>

When <math>\mathbf{c}_{x,y,z}=\langle 0,0,0\rangle,</math> and <math>\mathbf{\theta}_{x,y,z} = \langle 0,0,0\rangle,</math> the 3D vector <math>\langle 1,2,0 \rangle</math> is projected to the 2D vector <math>\langle 1,2 \rangle</math>.

Otherwise, to compute <math>\mathbf{b}_{x,y}</math> we first define a vector <math>\mathbf{d}_{x,y,z}</math> as the position of point ''A'' with respect to a [[coordinate system]] defined by the camera, with its origin at ''C'' and rotated by <math>\mathbf{\theta}</math> with respect to the initial coordinate system. This is achieved by [[Matrix addition|subtracting]] <math>\mathbf{c}</math> from <math>\mathbf{a}</math> and then applying a rotation by <math>-\mathbf{\theta}</math> to the result. This transformation is often called a '''{{visible anchor|camera transform}}'''. It can be written as a product of rotations about the ''x,'' ''y,'' and ''z'' axes (these calculations assume that the axes are ordered as a [[Cartesian coordinates#Orientation and handedness|left-handed]] system of axes): <!--Orthogonal transformation (pg 931) and Matrix and Adaptation for n-dimensional arbitrary rotation (pg 942):--><ref>{{cite book | last = Riley | first = K F | title = Mathematical Methods for Physics and Engineering | url = https://archive.org/details/mathematicalmeth00rile | url-access = registration | year = 2006 | publisher = [[Cambridge University Press]] | pages = [https://archive.org/details/mathematicalmeth00rile/page/931 931], 942 | isbn = 978-0-521-67971-8 }}</ref> <!--Related form, using rotation around intermediate axes--><ref>{{cite book | last = Goldstein | first = Herbert | title = Classical Mechanics | edition = 2nd | year = 1980 | pages = 146–148 | isbn = 978-0-201-02918-5 | publisher = Addison-Wesley Pub. Co. | location = Reading, Mass. }}</ref>
:<math>
\begin{bmatrix} \mathbf{d}_x \\ \mathbf{d}_y \\ \mathbf{d}_z \end{bmatrix}=\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos ( \mathbf{\theta}_x ) & \sin ( \mathbf{\theta}_x ) \\ 0 & -\sin ( \mathbf{\theta}_x ) & \cos ( \mathbf{\theta}_x ) \end{bmatrix}\begin{bmatrix} \cos ( \mathbf{\theta}_y ) & 0 & -\sin ( \mathbf{\theta}_y ) \\ 0 & 1 & 0 \\ \sin ( \mathbf{\theta}_y ) & 0 & \cos ( \mathbf{\theta}_y ) \end{bmatrix}\begin{bmatrix} \cos ( \mathbf{\theta}_z ) & \sin ( \mathbf{\theta}_z ) & 0 \\ -\sin ( \mathbf{\theta}_z ) & \cos ( \mathbf{\theta}_z ) & 0 \\ 0 & 0 & 1 \end{bmatrix}\left( {\begin{bmatrix} \mathbf{a}_x \\ \mathbf{a}_y \\ \mathbf{a}_z \\ \end{bmatrix} - \begin{bmatrix} \mathbf{c}_x \\ \mathbf{c}_y \\ \mathbf{c}_z \\ \end{bmatrix}} \right)
</math>

This representation corresponds to rotating by three [[Euler angles]] (more properly, [[Tait-Bryan angles|Tait–Bryan angles]]), using the ''xyz'' convention, which can be interpreted either as "rotate about the ''extrinsic'' axes (axes of the ''scene'') in the order ''z'', ''y'', ''x'' (reading right-to-left)" or "rotate about the ''intrinsic'' axes (axes of the ''camera'') in the order ''x, y, z'' (reading left-to-right)".

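The camera transform can be sketched in a few lines of Python. This is only an illustration under the conventions stated above (angles in radians, rotations applied in the order ''z'', ''y'', ''x'' as the matrix product reads right to left). The function name <code>camera_transform</code> and its argument layout are assumptions made for this example and do not come from the cited sources.

```python
import math

def camera_transform(a, c, theta):
    """Return d: the point a expressed in the camera's coordinate system.

    a, c  -- 3-tuples (x, y, z): the point and the camera position
    theta -- 3-tuple of camera rotation angles in radians
             (hypothetical argument layout chosen for this sketch)
    """
    # Shift so that the camera sits at the origin.
    x, y, z = a[0] - c[0], a[1] - c[1], a[2] - c[2]
    cx, cy, cz = (math.cos(t) for t in theta)
    sx, sy, sz = (math.sin(t) for t in theta)

    # Rightmost matrix first: rotation about the z axis.
    x1 = cz * x + sz * y
    y1 = -sz * x + cz * y
    z1 = z

    # Middle matrix: rotation about the y axis.
    x2 = cy * x1 - sy * z1
    y2 = y1
    z2 = sy * x1 + cy * z1

    # Leftmost matrix: rotation about the x axis.
    dx = x2
    dy = cx * y2 + sx * z2
    dz = -sx * y2 + cx * z2
    return (dx, dy, dz)
```

Applying the three rotations one after another, rather than multiplying the matrices first, keeps each step directly comparable with the corresponding matrix in the formula.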
If the camera is not rotated (<math>\mathbf{\theta}_{x,y,z} = \langle 0,0,0\rangle</math>), then the matrices drop out (as identities), and this reduces to simply a shift: <math>\mathbf{d} = \mathbf{a} - \mathbf{c}.</math>

Alternatively, without using matrices (writing <math>\mathbf{x}</math> for <math>a_x - c_x</math> and so on, and abbreviating <math>\cos\left(\theta_\alpha\right)</math> to <math>cos_\alpha</math> and <math>\sin\left(\theta_\alpha\right)</math> to <math>sin_\alpha</math>):
:<math>
\begin{align}
\mathbf{d}_x & = cos_y (sin_z \mathbf{y}+cos_z \mathbf{x})-sin_y \mathbf{z} \\
\mathbf{d}_y & = sin_x (cos_y \mathbf{z}+sin_y (sin_z \mathbf{y}+cos_z \mathbf{x}))+cos_x (cos_z \mathbf{y}-sin_z \mathbf{x}) \\
\mathbf{d}_z & = cos_x (cos_y \mathbf{z}+sin_y (sin_z \mathbf{y}+cos_z \mathbf{x}))-sin_x (cos_z \mathbf{y}-sin_z \mathbf{x})
\end{align}
</math>

This transformed point can then be projected onto the 2D plane using the formula (here, ''x''/''y'' is used as the projection plane; the literature may also use ''x''/''z''):<ref>{{Cite book | last1 = Sonka | first1 = M | last2 = Hlavac | first2 = V | last3 = Boyle | first3 = R | title = Image Processing, Analysis & Machine Vision | edition = 2nd | publisher = Chapman and Hall | year = 1995 | page = 14 | isbn = 978-0-412-45570-4 }}</ref>
:<math>
\begin{align}
\mathbf{b}_x &= \frac{\mathbf{e}_z}{\mathbf{d}_z} \mathbf{d}_x + \mathbf{e}_x, \\[5pt]
\mathbf{b}_y &= \frac{\mathbf{e}_z}{\mathbf{d}_z} \mathbf{d}_y + \mathbf{e}_y.
\end{align}
</math>

Or, in matrix form using [[homogeneous coordinates]], the system
:<math>
\begin{bmatrix} \mathbf{f}_x \\ \mathbf{f}_y \\ \mathbf{f}_w \end{bmatrix}=\begin{bmatrix} 1 & 0 & \frac{\mathbf{e}_x}{\mathbf{e}_z} \\ 0 & 1 & \frac{\mathbf{e}_y}{\mathbf{e}_z} \\ 0 & 0 & \frac{1}{\mathbf{e}_z} \end{bmatrix}\begin{bmatrix} \mathbf{d}_x \\ \mathbf{d}_y \\ \mathbf{d}_z \end{bmatrix}
</math>
in conjunction with an argument using similar triangles, leads to division by the homogeneous coordinate, giving
:<math>
\begin{align}
\mathbf{b}_x &= \mathbf{f}_x / \mathbf{f}_w \\
\mathbf{b}_y &= \mathbf{f}_y / \mathbf{f}_w
\end{align}
</math>

The distance of the viewer from the display surface, <math>\mathbf{e}_z</math>, directly relates to the field of view, where <math>\alpha=2 \cdot \arctan(1/\mathbf{e}_z)</math> is the viewed angle. (This assumes that the points (−1,−1) and (1,1) are mapped to the corners of the viewing surface.)

The above equations can also be rewritten as:
:<math>
\begin{align}
\mathbf{b}_x & = \frac{\mathbf{d}_x \mathbf{s}_x}{\mathbf{d}_z \mathbf{r}_x} \mathbf{r}_z, \\[5pt]
\mathbf{b}_y & = \frac{\mathbf{d}_y \mathbf{s}_y}{\mathbf{d}_z \mathbf{r}_y} \mathbf{r}_z.
\end{align}
</math>
Here <math>\mathbf{s}_{x,y}</math> is the display size, <math>\mathbf{r}_{x,y}</math> is the recording surface size ([[Charge-coupled device|CCD]] or [[photographic film]]), <math>\mathbf{r}_z</math> is the distance from the recording surface to the [[entrance pupil]] ([[pinhole camera model|camera center]]), and <math>\mathbf{d}_z</math> is the distance from the 3D point being projected to the entrance pupil.

Subsequent clipping and scaling operations may be necessary to map the 2D plane onto any particular display medium.
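
The projection step and its homogeneous-coordinate form can be checked against each other with a short Python sketch. The function names <code>project</code>, <code>project_homogeneous</code> and <code>field_of_view</code> are hypothetical, chosen only for this example. The guard against <math>\mathbf{d}_z = 0</math> is also an assumption of the sketch, since the formulas above leave that case undefined.

```python
import math

def project(d, e):
    """Direct form: b = (e_z / d_z) * d_xy + e_xy."""
    dx, dy, dz = d
    ex, ey, ez = e
    if dz == 0:
        # Assumption of this sketch: reject points in the camera plane,
        # where the division by d_z is undefined.
        raise ValueError("d_z is zero; the point cannot be projected")
    return (ez / dz * dx + ex, ez / dz * dy + ey)

def project_homogeneous(d, e):
    """Homogeneous form: build (f_x, f_y, f_w), then divide by f_w."""
    dx, dy, dz = d
    ex, ey, ez = e
    fx = dx + (ex / ez) * dz
    fy = dy + (ey / ez) * dz
    fw = dz / ez
    if fw == 0:
        raise ValueError("f_w is zero; the point cannot be projected")
    return (fx / fw, fy / fw)

def field_of_view(ez):
    """Viewed angle in radians, assuming the display spans -1..1."""
    return 2 * math.atan(1 / ez)

if __name__ == "__main__":
    d = (2.0, 4.0, 4.0)   # a point already in camera coordinates
    e = (0.0, 0.0, 2.0)   # display surface two units in front of the camera
    print(project(d, e))
    print(project_homogeneous(d, e))
    print(math.degrees(field_of_view(e[2])))
```

Both functions should agree for any point with non-zero <math>\mathbf{d}_z</math>, which is a quick sanity check when implementing either form.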