Coordinate spaces and space conversions are an integral part of computer graphics. Knowing the properties of commonly used spaces and when to use them is often enough, as the conversions between these spaces are almost always the same.

Cheat sheet for converting positions between common spaces.

Transforming Coordinates into Other Spaces

Converting between coordinate spaces can be simply treated as a black box, but understanding the conversions can be very helpful in certain scenarios.

Affine Transformations

All conversions are achieved using affine transformations. The affine transformations considered here are automorphisms of an affine space that map lines to lines and preserve parallelism. The vector spaces we will be working with are Euclidean spaces, which are specific affine spaces. Being an automorphism of such a space means that the transformation is a bijective (and therefore reversible) homomorphism which maps the elements of a space onto elements of that exact same space.

Most importantly, affine transformations can be expressed as combinations of linear transformations and translations. Let's break that down even further.

Translations

Translations are transformations that move all points in a given direction by a given distance. A very simple translation $T : ℝ^{3} \to ℝ^{3}$ is:

$T (\vec{x}) := \vec{x} + \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$

In fact, adding any offset $\vec{v} \in ℝ^{3}$ to $\vec{x}$ qualifies as a translation.

Linear Transformations

A transformation $T : ℝ^{n} \to ℝ^{n}$ is linear if and only if it can be expressed as the multiplication of an $n \times n$ matrix $A \in ℝ^{n \times n}$ with the input vector, i.e. $\forall \vec{x} \in ℝ^{n} : T (\vec{x}) = A \vec{x}$.

Linear transformations cover operations like rotation, reflection, scale, shear, skew, squeeze and projection, as well as any combination of them. However, they do not move the origin of the space to a different location. As such, an important property of all linear transformations $T$ is that $T (\vec{0}) = \vec{0}$.
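To make this concrete, here is a minimal sketch in Python that applies a rotation as a matrix multiplication. The helper names `mat_vec` and `rotation_z` are made up for this illustration, and the 90 degree rotation around the z-axis is just an arbitrary example of a linear transformation.

```python
import math

def mat_vec(a, x):
    """Multiply a matrix (list of rows) with a column vector (list)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def rotation_z(angle):
    """3x3 matrix that rotates by `angle` radians around the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

A = rotation_z(math.pi / 2)

# The x-axis is rotated onto the y-axis (up to floating-point rounding).
rotated = mat_vec(A, [1.0, 0.0, 0.0])

# The origin stays where it is, as for every linear transformation.
origin = mat_vec(A, [0.0, 0.0, 0.0])
```

Whatever matrix is used in place of `A`, multiplying it with the zero vector can only produce the zero vector again.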

Expressing linear transformations with their equivalent matrices is very convenient and used throughout all of computer graphics. It allows us to concatenate multiple linear transformations by simply multiplying their respective matrices.
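The following Python sketch illustrates this concatenation with two example matrices, a non-uniform scale and a 90 degree rotation around the z-axis. The helpers `mat_mul` and `mat_vec` and all values are assumptions chosen for this illustration.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def mat_vec(a, x):
    """Multiply a matrix (list of rows) with a column vector (list)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

scale = [[2.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]   # stretch along x by a factor of 2
rotate = [[0.0, -1.0, 0.0],
          [1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0]]  # 90 degrees around the z-axis

x = [1.0, 2.0, 3.0]

# Apply the scale first, then the rotation, one step at a time.
step_by_step = mat_vec(rotate, mat_vec(scale, x))

# Multiply the matrices once, then apply the combined matrix.
combined = mat_mul(rotate, scale)
all_at_once = mat_vec(combined, x)

# Swapping the factors applies the rotation first and gives a different result.
swapped = mat_vec(mat_mul(scale, rotate), x)
```

Both `step_by_step` and `all_at_once` evaluate to `[-2.0, 2.0, 3.0]`. The transformation that is applied first stands rightmost in the product, so the order of the factors matters.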

But what happens if we need to apply an offset $\vec{v}$ in order to transform coordinates from one space to another? This means that we need to apply a translation, or more generally an affine transformation $T (\vec{x}) = A \vec{x} + \vec{v}$, which unfortunately can't be expressed as a multiplication with an $n \times n$ matrix.

Well actually, it can be expressed as a matrix by using a trick! We just need to increase the number of dimensions we are working in by one and work with homogeneous coordinates.

Homogeneous Coordinates

Homogeneous coordinates are a fairly complex topic, but they allow us to take advantage of the following property: an affine transformation in $ℝ^{n}$ can be expressed as a linear transformation in $ℝ^{n + 1}$, meaning that we can write it as a matrix.

Let's assume that $n = 3$. We want to write the affine transformation $T : ℝ^{3} \to ℝ^{3}$ with $T (\vec{x}) = A \vec{x} + \vec{v}$ using the matrix $H \in ℝ^{4 \times 4}$.

$H := \begin{pmatrix} A_{11} & A_{12} & A_{13} & v_{x} \\ A_{21} & A_{22} & A_{23} & v_{y} \\ A_{31} & A_{32} & A_{33} & v_{z} \\ 0 & 0 & 0 & 1 \end{pmatrix}$

$H$ is constructed by simply adding a fourth row and column to $A$. Specifically, we add the translation/offset vector $\vec{v}$ as the fourth column, and then append the fourth row of the identity matrix. Our input now also needs to be four-dimensional - the fourth component acts as a scale factor for the translation. If we set it to 0, the translation will not be applied. That is why we need to set it to 1, so that the offset is applied unchanged.

By simply following the rules of matrix multiplication, we can see that the first three components of the result are exactly what we wanted to calculate, namely the output of our affine transformation. The fourth component is still set to 1.

$H \cdot \begin{pmatrix} x_{x} \\ x_{y} \\ x_{z} \\ 1 \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & A_{13} & v_{x} \\ A_{21} & A_{22} & A_{23} & v_{y} \\ A_{31} & A_{32} & A_{33} & v_{z} \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} x_{x} \\ x_{y} \\ x_{z} \\ 1 \end{pmatrix} = \begin{pmatrix} A_{11} x_{x} + A_{12} x_{y} + A_{13} x_{z} + v_{x} \\ A_{21} x_{x} + A_{22} x_{y} + A_{23} x_{z} + v_{y} \\ A_{31} x_{x} + A_{32} x_{y} + A_{33} x_{z} + v_{z} \\ 1 \end{pmatrix} = \begin{pmatrix} A \vec{x} + \vec{v} \\ 1 \end{pmatrix} = \begin{pmatrix} T (\vec{x}) \\ 1 \end{pmatrix}$
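The same calculation can be sketched in Python with concrete numbers. The helper names `make_homogeneous` and `mat_vec`, the uniform scale used for $A$ and the offset $\vec{v}$ are all assumptions picked for this illustration.

```python
def make_homogeneous(a, v):
    """Build the 4x4 matrix H from a 3x3 matrix A and an offset vector v."""
    return [a[0] + [v[0]],
            a[1] + [v[1]],
            a[2] + [v[2]],
            [0.0, 0.0, 0.0, 1.0]]

def mat_vec(a, x):
    """Multiply a matrix (list of rows) with a column vector (list)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

A = [[2.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 2.0]]  # example linear part: uniform scale by 2
v = [1.0, 0.0, 0.0]    # example offset
H = make_homogeneous(A, v)

# Fourth component 1: the offset is applied, giving A x + v.
with_offset = mat_vec(H, [1.0, 2.0, 3.0, 1.0])

# Fourth component 0: the offset is skipped, giving only A x.
without_offset = mat_vec(H, [1.0, 2.0, 3.0, 0.0])
```

Here `with_offset` is `[3.0, 4.0, 6.0, 1.0]` and `without_offset` is `[2.0, 4.0, 6.0, 0.0]`, so the fourth component of the input decides whether the translation takes effect.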

A big advantage of using homogeneous coordinates and matrices for expressing affine transformations is that - because matrix multiplication is associative - we can easily concatenate multiple transformations by multiplying their matrices, resulting in a single matrix. Without using homogeneous coordinates, this can only be done with linear transformations, as already mentioned above.

For example, let's assume that we want to transform the coordinates of a vertex from model space to clip space. The model space to player space transformation contains the offset of the chunk from the world origin as a translation, meaning that it is not linear. Thanks to homogeneous coordinates, we can still express it as a matrix, and combine the model, view and projection matrices into a single matrix that only has to be calculated once for every chunk, as demonstrated in this pseudo-code:

projectionMatrix * (viewMatrix * (modelMatrix * vertexPosition))
(projectionMatrix * viewMatrix * modelMatrix) * vertexPosition
modelViewProjectionMatrix * vertexPosition
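The equivalence of these three lines can be checked with a small Python sketch. All matrices below are hypothetical stand-ins with made-up values: the model and view matrices are pure translations, and the "projection" is a simple scale rather than a real projection matrix. The helpers `mat_mul`, `mat_vec` and `translation` are likewise assumptions for this illustration.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def mat_vec(a, x):
    """Multiply a matrix (list of rows) with a column vector (list)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def translation(tx, ty, tz):
    """4x4 homogeneous matrix that translates by (tx, ty, tz)."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

model_matrix = translation(16.0, 0.0, 32.0)    # hypothetical chunk offset
view_matrix = translation(-10.0, -5.0, -20.0)  # hypothetical camera offset
projection_matrix = [[0.5, 0.0, 0.0, 0.0],     # stand-in, not a real projection
                     [0.0, 0.5, 0.0, 0.0],
                     [0.0, 0.0, 0.5, 0.0],
                     [0.0, 0.0, 0.0, 1.0]]

vertex_position = [1.0, 2.0, 3.0, 1.0]

# Three matrix-vector products for every vertex.
step_by_step = mat_vec(projection_matrix,
                       mat_vec(view_matrix,
                               mat_vec(model_matrix, vertex_position)))

# Combine the matrices once, then one matrix-vector product per vertex.
mvp_matrix = mat_mul(mat_mul(projection_matrix, view_matrix), model_matrix)
all_at_once = mat_vec(mvp_matrix, vertex_position)
```

Both `step_by_step` and `all_at_once` evaluate to `[3.5, -1.5, 7.5, 1.0]`. The values were chosen so that the floating-point arithmetic is exact; with arbitrary matrices the two results may differ by small rounding errors.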
