Graphics pipeline
=== Geometry ===

The geometry step (with the [[geometry pipeline]]), which is responsible for the majority of the operations with [[polygon]]s and their [[vertex (computer graphics)|vertices]] (with the [[vertex pipeline]]), can be divided into the following five tasks. How these tasks are organized as actual parallel pipeline steps depends on the particular implementation.

[[File:Geometry pipeline en.svg|550px]]

==== Definitions ====

A ''vertex'' (plural: vertices) is a point in the world. Many such points are joined to form surfaces. In special cases, [[point cloud]]s are drawn directly, but this is still the exception.

A ''[[triangle primitives|triangle]]'' is the most common geometric primitive of computer graphics. It is defined by its three vertices and a [[normal vector]] – the normal vector indicates the front face of the triangle and is perpendicular to the surface. The triangle may be provided with a color or with a [[texture (computer graphics)|texture]] (an image "glued" on top of it). Triangles are preferred over rectangles because their three points always lie in a single [[plane (geometry)|plane]].

==== The World Coordinate System ====

The world [[3d world coordinate system|coordinate system]] is the coordinate system in which the virtual world is created. It should meet a few conditions so that the following mathematics is easily applicable:

* It must be a rectangular Cartesian coordinate system in which all axes are equally scaled. How the unit of the coordinate system is defined is left to the developer. Whether the unit vector of the system corresponds in reality to one meter or an [[Ångström]] depends on the application.
* Whether a [[Left-handed coordinate system|right-handed or a left-handed]] coordinate system is to be used may be determined by the graphics library in use.
: ''Example:'' If we are to develop a flight simulator, we can choose the world coordinate system so that the origin is in the middle of the Earth and the unit is set to one meter. In addition, to make the reference to reality easier, we define that the X axis should intersect the equator on the zero meridian, and the Z axis passes through the poles. In a right-handed system, the Y-axis then runs through the 90°-East meridian (somewhere in the [[Indian Ocean]]). Now we have a coordinate system that describes every point on Earth in three-dimensional [[Cartesian coordinates]]. In this coordinate system, we can now model the features of our world: mountains, valleys, and oceans.
: ''Note:'' Aside from computer geometry, [[geographic coordinates]] are used for the [[Earth]], i.e., [[latitude]] and [[longitude]], as well as altitudes above sea level. The approximate conversion – if one does not consider the fact that the Earth is not an exact sphere – is simple:
: <math>\begin{pmatrix} x\\ y\\ z \end{pmatrix}=\begin{pmatrix} (R+{hasl})*\cos({lat})*\cos({long})\\ (R+{hasl})*\cos({lat})*\sin({long})\\ (R+{hasl})*\sin({lat}) \end{pmatrix} </math>
: with R = radius of the Earth (6,378,137 m); lat = latitude; long = longitude; hasl = height above sea level.
: All of the following examples apply in a right-handed system. For a left-handed system, the signs may need to be interchanged.

The objects contained within the scene (houses, trees, cars) are often designed in their own object coordinate system (also called the model coordinate system or local coordinate system) for reasons of simpler modeling. To assign these objects to coordinates in the world coordinate system (or global coordinate system) of the entire scene, the object coordinates are transformed by means of translation, rotation, or scaling. This is done by multiplying with the corresponding [[transformation matrix|transformation matrices]].
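The approximate spherical conversion from the note above can be sketched in a few lines. This is a minimal illustration, not part of any graphics API; the function name and the use of degrees as input are assumptions for the example.

```python
import math

EARTH_RADIUS = 6378137.0  # R, radius of the Earth in meters

def geo_to_cartesian(lat_deg, lon_deg, hasl):
    """Approximate conversion of latitude/longitude/height above sea level
    to world coordinates, treating the Earth as a perfect sphere."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    r = EARTH_RADIUS + hasl
    x = r * math.cos(lat) * math.cos(lon)
    y = r * math.cos(lat) * math.sin(lon)
    z = r * math.sin(lat)
    return (x, y, z)

# The intersection of the equator and the zero meridian lies on the X axis:
print(geo_to_cartesian(0.0, 0.0, 0.0))  # → (6378137.0, 0.0, 0.0)
```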
In addition, several differently transformed copies can be formed from one object, for example a forest from a tree; this technique is called instancing.

: To place a model of an aircraft in the world, we first determine four matrices. Since we work in three-dimensional space, we need four-dimensional [[Homogeneous coordinates|homogeneous matrices]] for our calculations. First, we need three [[rotation matrix|rotation matrices]], namely one for each of the three aircraft axes (vertical axis, transverse axis, longitudinal axis).
: Around the X axis (usually defined as the longitudinal axis in the object coordinate system):
: <math>R_x=\begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & \cos(\alpha) & \sin(\alpha) & 0\\ 0 & -\sin(\alpha) & \cos(\alpha) & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}</math>
: Around the Y axis (usually defined as the transverse axis in the object coordinate system):
: <math>R_y=\begin{pmatrix} \cos(\alpha) & 0 & -\sin(\alpha) & 0\\ 0 & 1 & 0 & 0\\ \sin(\alpha) & 0 & \cos(\alpha) & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}</math>
: Around the Z axis (usually defined as the vertical axis in the object coordinate system):
: <math>R_z=\begin{pmatrix} \cos(\alpha) & \sin(\alpha) & 0 & 0\\ -\sin(\alpha) & \cos(\alpha) & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}</math>

We also use a translation matrix that moves the aircraft to the desired point in our world:

: <math>T_{x,y,z}=\begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ x & y & z & 1 \end{pmatrix}</math>

: ''Remark:'' The above matrices are [[matrix transpose|transposed]] with respect to the ones in the article [[rotation matrix]]. See further down for an explanation of why.

Now we could calculate the position of the vertices of the aircraft in world coordinates by multiplying each point successively with these four matrices.
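The four matrices above can be sketched directly in code. This is an illustrative implementation in the row-vector convention used by this section (points multiply the matrices from the left); the helper names are chosen for the example and do not belong to any particular library.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices (nested lists)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def vec_mat(v, m):
    """Multiply a 1x4 row vector by a 4x4 matrix (row-vector convention)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def translate(x, y, z):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [x, y, z, 1]]

# Rotate the point (1, 0, 0) by 90° around Z (onto the Y axis),
# then translate it by (10, 0, 0):
world = mat_mul(rot_z(math.pi / 2), translate(10, 0, 0))
p = vec_mat([1, 0, 0, 1], world)  # approximately (10, 1, 0)
```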
Since the [[matrix multiplication|multiplication of a matrix with a vector]] is quite expensive, one usually takes another path and first multiplies the four matrices together. The multiplication of two matrices is even more expensive, but it must be executed only once for the whole object. The multiplications <math>((((v*R_x)*R_y)*R_z)*T)</math> and <math>(v*(((R_x*R_y)*R_z)*T))</math> are equivalent. Thereafter, the resulting matrix could be applied to the vertices. In practice, however, the multiplication with the vertices is still not applied; instead, the camera matrices (see below) are determined first.

: For our example from above, however, the translation has to be determined somewhat differently, since the common meaning of ''[[Body relative direction|up]]'' – apart from at the North Pole – does not coincide with our definition of the positive Z axis, and therefore the model must also be rotated around the center of the Earth: <math>T_{sphere} = T_{x,y,z}(0,0, R+{hasl})*R_y(\pi/2-{lat})*R_z({long})</math> The first step pushes the origin of the model to the correct height above the Earth's surface; then it is rotated by latitude and longitude.

The order in which the matrices are applied is important because [[matrix multiplication]] is ''not'' [[commutative property|commutative]]. This also applies to the three rotations, which can be demonstrated by an example: The point (1, 0, 0) lies on the X-axis. If one rotates it first by 90° around the X-axis and then around the Y-axis, it ends up on the Z-axis (the rotation around the X-axis does not affect a point that is on that axis). If, on the other hand, one rotates around the Y-axis first and then around the X-axis, the resulting point is located on the Y-axis. The sequence itself is arbitrary as long as it is always the same.
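The non-commutativity example above can be checked numerically. The sketch below repeats the rotation matrices in the section's row-vector convention (helper names are for illustration only) and applies the two orderings to the point (1, 0, 0):

```python
import math

def vec_mat(v, m):
    # row vector times 4x4 matrix, as used throughout this section
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, -s, 0], [0, 1, 0, 0], [s, 0, c, 0], [0, 0, 0, 1]]

p = [1, 0, 0, 1]   # a point on the X axis
a = math.pi / 2    # 90 degrees

# First around X, then around Y: the point ends up on the Z axis.
x_then_y = vec_mat(vec_mat(p, rot_x(a)), rot_y(a))
# First around Y, then around X: the point ends up on the Y axis.
y_then_x = vec_mat(vec_mat(p, rot_y(a)), rot_x(a))
```

Up to floating-point rounding, `x_then_y` has only a Z component and `y_then_x` only a Y component, confirming that the two orderings give different results.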
The sequence with x, then y, then z (roll, pitch, heading) is often the most intuitive because the rotation causes the compass direction to coincide with the direction of the "nose".

There are also two conventions to define these matrices, depending on whether you want to work with column vectors or row vectors. Different graphics libraries have different preferences here; [[OpenGL]] prefers column vectors, [[DirectX]] row vectors. The decision determines from which side the point vectors are to be multiplied by the transformation matrices. For column vectors, the multiplication is performed from the right, i.e. <math>v_{out} = M * v_{in}</math>, where v<sub>out</sub> and v<sub>in</sub> are 4×1 column vectors. The concatenation of the matrices is also done from right to left, e.g. <math>M = T_x * R_x</math>, when first rotating and then shifting. In the case of row vectors, this works exactly the other way around. The multiplication now takes place from the left as <math>v_{out} = v_{in} * M</math> with 1×4 row vectors, and the concatenation is <math>M = R_x * T_x</math> when we also first rotate and then move. The matrices shown above are valid for the second case, while those for column vectors are transposed. The rule <math>(v*M)^{T} = M^{T}*v^{T}</math><ref>{{cite book |first1=K. |last1=Nipp |first2=D. |last2=Stoffer |title=Lineare Algebra |publisher=v/d/f Hochschulverlag der ETH Zürich |date=1998 |isbn=3-7281-2649-7}}</ref> applies, which for multiplication with vectors means that you can switch the multiplication order by transposing the matrix.

In matrix chaining, each transformation defines a new coordinate system, allowing for flexible extensions. For instance, an aircraft's propeller, modeled separately, can be attached to the aircraft nose by a translation that only shifts from the aircraft coordinate system to the propeller coordinate system.
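The transpose rule relating the two conventions can be verified with a small numerical check. The matrix and vector below are arbitrary example values; the helper names are not from any library:

```python
def vec_mat(v, m):        # row-vector convention: v * M
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def mat_vec(m, v):        # column-vector convention: M * v
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def transpose(m):
    return [[m[j][i] for j in range(4)] for i in range(4)]

# A translation in row-vector form (last row holds the offset):
T = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [5, 6, 7, 1]]
v = [1, 2, 3, 1]

# (v*M)^T = M^T * v^T: both conventions yield the same components.
assert vec_mat(v, T) == mat_vec(transpose(T), v)
```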
To render the aircraft, the complete transformation matrix is computed for its points; for the propeller points, the propeller model matrix is additionally multiplied by the aircraft's matrix. The matrix calculated in this way is also called the ''world matrix''. It must be determined for each object in the world before rendering. The application can introduce changes here, for example, changing the position of the aircraft according to the speed after each frame.

==== Camera Transformation ====

[[File:View transform.svg|thumb|upright=1.5|Left: Position and direction of the virtual viewer (camera), as defined by the user. Right: Positioning the objects after the camera transformation. The light gray area is the visible volume.]]

In addition to the objects, the scene also defines a virtual camera or viewer that indicates the position and direction of view relative to which the scene is rendered. The scene is transformed so that the camera is at the origin looking along the Z-axis. The resulting coordinate system is called the camera coordinate system, and the transformation is called the ''camera transformation'' or ''view transformation''.

: The view matrix is usually determined from the camera position, the target point (where the camera looks), and an "up vector" ("up" from the viewer's viewpoint). First, three auxiliary vectors are required:
: {{Code|1=Zaxis = normal(cameraPosition - cameraTarget)}}
: {{Code|1=Xaxis = normal(cross(cameraUpVector, Zaxis))}}
: {{Code|1=Yaxis = cross(Zaxis, Xaxis)}}
: with normal(v) = normalization of the vector v;
: cross(v1, v2) = [[cross product]] of v1 and v2.
: Finally, the matrix: <math>\begin{pmatrix} {xaxis}.x & {yaxis}.x & {zaxis}.x & 0\\ {xaxis}.y & {yaxis}.y & {zaxis}.y & 0\\ {xaxis}.z & {yaxis}.z & {zaxis}.z & 0\\ -{dot}({xaxis}, {cameraPosition}) & -{dot}({yaxis},{cameraPosition}) & -{dot}({zaxis},{cameraPosition}) & 1 \end{pmatrix}</math>
: with dot(v1, v2) = [[dot product]] of v1 and v2.

==== Projection ====

The [[3D projection]] step transforms the view volume into a cube with the corner point coordinates (−1, −1, 0) and (1, 1, 1); occasionally other target volumes are also used. This step is called ''projection'', even though it transforms a volume into another volume, since the resulting Z coordinates are not stored in the image but are only used in [[Z-buffering]] in the later rasterization step. In a [[perspective (visual)|perspective illustration]], a [[central projection]] is used. To limit the number of displayed objects, two additional clipping planes are used; the visual volume is therefore a truncated pyramid ([[frustum]]). The parallel or [[orthogonal projection]] is used, for example, for technical representations because it has the advantage that all parallels in the object space are also parallel in the image space, and surfaces and volumes have the same size regardless of the distance from the viewer. Maps use, for example, an orthogonal projection (a so-called [[orthophoto]]), but oblique images of a landscape cannot be used in this way – although they can technically be rendered, they appear so distorted that we cannot make any use of them.

The formula for calculating a perspective mapping matrix is:

: <math>\begin{pmatrix} w & 0 & 0 & 0\\ 0 & h & 0 & 0\\ 0 & 0 & {far}/({near-far}) & -1\\ 0 & 0 & ({near}*{far}) / ({near}-{far}) & 0 \end{pmatrix}</math>
: with h = cot(fieldOfView / 2.0) (aperture angle of the camera); w = h / aspectRatio (aspect ratio of the target image); near = smallest distance to be visible; far = longest distance to be visible.
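The view-matrix construction from the camera transformation above can be sketched as follows. This is a minimal illustration in the section's row-vector convention; the vector helpers and the function name `look_at` are chosen for the example:

```python
import math

def sub(a, b): return [a[i] - b[i] for i in range(3)]
def dot(a, b): return sum(a[i] * b[i] for i in range(3))
def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]
def normal(v):
    length = math.sqrt(dot(v, v))
    return [c / length for c in v]

def look_at(camera_position, camera_target, camera_up):
    """Build the view matrix from the three auxiliary axes."""
    zaxis = normal(sub(camera_position, camera_target))
    xaxis = normal(cross(camera_up, zaxis))
    yaxis = cross(zaxis, xaxis)
    return [
        [xaxis[0], yaxis[0], zaxis[0], 0],
        [xaxis[1], yaxis[1], zaxis[1], 0],
        [xaxis[2], yaxis[2], zaxis[2], 0],
        [-dot(xaxis, camera_position), -dot(yaxis, camera_position),
         -dot(zaxis, camera_position), 1],
    ]

def vec_mat(v, m):
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

# A camera at (0, 0, 10) looking at the origin:
V = look_at([0, 0, 10], [0, 0, 0], [0, 1, 0])
# The camera position maps to the origin of the camera coordinate system,
# and the target point lands on the Z axis in front of the camera.
cam_in_view = vec_mat([0, 0, 10, 1], V)
target_in_view = vec_mat([0, 0, 0, 1], V)
```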
The reasons why the smallest and the greatest distance have to be given here are, on the one hand, that coordinates are divided by this distance to achieve the perspective scaling of the scene (more distant objects are smaller in a perspective image than near objects), and on the other hand, to scale the Z values into the range 0..1 for filling the [[Z-buffer]]. This buffer often has a resolution of only 16 bits, which is why the near and far values should be chosen carefully. Too large a difference between the near and the far value leads to so-called [[Z-fighting]] because of the low resolution of the Z-buffer. It can also be seen from the formula that the near value cannot be 0, because this point is the focus point of the projection. There is no picture at this point.

For the sake of completeness, the formula for parallel projection (orthogonal projection):

: <math>\begin{pmatrix} 2.0/w & 0 & 0 & 0\\ 0 & 2.0/h & 0 & 0\\ 0 & 0 & 1.0/({near-far}) & 0\\ 0 & 0 & {near} / ({near}-{far}) & 1 \end{pmatrix}</math>
: with w = width of the target cube (dimension in units of the world coordinate system); h = w / aspectRatio (aspect ratio of the target image); near = smallest distance to be visible; far = longest distance to be visible.

For reasons of efficiency, the camera and projection matrix are usually combined into one transformation matrix so that the camera coordinate system is omitted. The resulting matrix is usually the same for a single image, while the world matrix looks different for each object. In practice, therefore, view and projection are pre-calculated so that only the world matrix has to be adapted during the display. However, more complex transformations such as vertex blending are possible. Freely programmable [[geometry shader]]s that modify the geometry can also be executed. In the actual rendering step, the matrix world matrix * camera matrix * projection matrix is calculated and then finally applied to every single point.
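The mapping of the near plane to z' = 0 and the far plane to z' = 1 can be verified from the perspective matrix given above. The sketch below assumes a right-handed setup in which visible points lie at negative Z in camera space (as in DirectX's right-handed convention); the helper names are chosen for the example:

```python
import math

def perspective(field_of_view, aspect_ratio, near, far):
    """Perspective projection matrix in the row-vector convention used above."""
    h = 1.0 / math.tan(field_of_view / 2.0)  # cot(fieldOfView / 2)
    w = h / aspect_ratio
    return [
        [w, 0, 0, 0],
        [0, h, 0, 0],
        [0, 0, far / (near - far), -1],
        [0, 0, near * far / (near - far), 0],
    ]

def project(v, m):
    """Apply the matrix to a row vector and divide by the resulting w."""
    out = [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]
    return [c / out[3] for c in out[:3]]

P = perspective(math.pi / 2, 1.0, 1.0, 100.0)  # near = 1, far = 100
# A point on the near plane ends up at z' = 0, one on the far plane at z' = 1:
near_z = project([0.0, 0.0, -1.0, 1.0], P)[2]
far_z = project([0.0, 0.0, -100.0, 1.0], P)[2]
```

The division by `out[3]` (which equals the distance from the camera) is exactly the perspective division mentioned above: more distant points are scaled down more strongly.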
Thus, the points of all objects are transferred directly to the screen coordinate system (at least almost: the value range of the axes is still −1..1 for the visible range; see the section "Window-Viewport transformation").

==== Lighting ====

Often a scene contains light sources placed at different positions to make the lighting of the objects appear more realistic. In this case, a gain factor for the texture is calculated for each vertex based on the light sources and the material properties associated with the corresponding triangle. In the later rasterization step, the vertex values of a triangle are interpolated over its surface. A general lighting (ambient light) is applied to all surfaces. It is the diffuse and thus direction-independent brightness of the scene. The sun is a directed light source, which can be assumed to be infinitely far away. The illumination effected by the sun on a surface is determined by forming the scalar product of the directional vector from the sun and the normal vector of the surface. If the value is negative, the surface is facing the sun.

==== Clipping ====

[[File:Cube clipping.svg|thumb|Clipping of primitives against the cube. The blue triangle is discarded while the orange triangle is clipped, creating two new vertices.]]
[[File:Frustum.png|thumb|Frustum]]

Only the primitives that are within the visual volume need to be [[rasterization|rasterized]] (drawn). This visual volume is defined as the inside of a [[frustum]], a shape in the form of a pyramid with a cut-off top. Primitives that are completely outside the visual volume are discarded; this is called [[frustum culling]]. Further culling methods, such as back-face culling, which reduce the number of primitives to be considered, can theoretically be executed in any step of the graphics pipeline. Primitives that are only partially inside the cube must be [[clipping (computer graphics)|clipped]] against the cube.
The advantage of the previous projection step is that the clipping always takes place against the same cube. Only the – possibly clipped – primitives that are within the visual volume are forwarded to the final step.

==== Window-Viewport transformation ====

[[File:Screen Mapping.svg|thumb|Window-Viewport-Transformation]]

To output the image to any target area (viewport) of the screen, another transformation, the ''Window-Viewport transformation'', must be applied. This is a shift, followed by scaling. The resulting coordinates are the device coordinates of the output device. The viewport contains six values: the height and width of the window in pixels, the upper left corner of the window in window coordinates (usually 0, 0), and the minimum and maximum values for Z (usually 0 and 1).

: Formally: <math>\begin{pmatrix} x\\ y\\ z \end{pmatrix}=\begin{pmatrix} {vp}.X+(1.0+v.X)*{vp}.{width}/2.0\\ {vp}.Y+(1.0-v.Y)*{vp}.{height}/2.0\\ {vp}.{minz}+v.Z*({vp}.{maxz} - {vp}.{minz}) \end{pmatrix}</math>
: with vp = viewport; v = point after projection.

On modern hardware, most of the geometry computation steps are performed in the [[vertex shader]]. This is, in principle, freely programmable, but it generally performs at least the transformation of the points and the illumination calculation. For the DirectX programming interface, the use of a custom vertex shader is necessary from version 10 onward, while older versions still provide a standard shader.
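The viewport formula above translates directly into code. The sketch below is illustrative; the representation of the six viewport values as a dictionary and the function name are assumptions for the example:

```python
def window_viewport(v, vp):
    """Map a projected point v (x, y in -1..1, z in 0..1) to device
    coordinates, following the formula above. vp holds the six
    viewport values."""
    x = vp["x"] + (1.0 + v[0]) * vp["width"] / 2.0
    y = vp["y"] + (1.0 - v[1]) * vp["height"] / 2.0
    z = vp["min_z"] + v[2] * (vp["max_z"] - vp["min_z"])
    return (x, y, z)

vp = {"x": 0, "y": 0, "width": 1920, "height": 1080,
      "min_z": 0.0, "max_z": 1.0}
# The center of the visible volume lands in the middle of the screen:
print(window_viewport((0.0, 0.0, 0.5), vp))  # → (960.0, 540.0, 0.5)
```

Note the sign flip on the Y component: window coordinates grow downward from the upper left corner, while the projected Y axis points upward.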