27 July 2017 Operator-based homogeneous coordinates: application in camera document scanning
Abstract
An operator-based approach for the study of homogeneous coordinates and projective geometry is proposed. First, some basic geometrical concepts and properties of the operators are investigated in the one- and two-dimensional cases. Then, the pinhole camera model is derived, and a simple method for homography estimation and camera calibration is explained. The usefulness of the analyzed theoretical framework is exemplified by addressing the perspective correction problem for a camera document scanning application. Several experimental results are provided for illustrative purposes. The proposed approach is expected to provide practical insights for inexperienced students on camera calibration, computer vision, and optical metrology among others.

## Introduction

Projective geometry is an important topic in computer vision because it provides a useful camera imaging model and its fundamental properties.1 Some applications of this topic are found in camera motion,2 camera calibration,3,4 pose estimation for augmented reality,5 perspective correction,6 and three-dimensional (3-D) surface imaging7 among others.

Theoretical concepts of projective geometry are analyzed simply and elegantly using homogeneous coordinates.8,9 However, projective geometry is commonly presented in abstract form, leaving a gap in how to apply it to computer vision problems.10 Moreover, homogeneous coordinates are usually written in a notation that masks basic geometrical aspects and may confuse inexperienced readers.11

In this paper, a simple and intuitive approach to some useful concepts of projective geometry is presented. For this, an alternative notation for homogeneous coordinates based on operators is suggested. To highlight the relevance of this topic in computer vision, the presentation is motivated by a specific problem, namely the perspective correction for a “camera scanner” application.

First, the proposed operators for homogeneous coordinates are defined in Sec. 2. Next, some basic concepts of projective geometry in the one- (1-D) and two-dimensional (2-D) cases are presented in Secs. 3 and 4, respectively. Then, the pinhole camera model is derived in Sec. 5. A perspective correction method, useful for camera document scanning, is described in Sec. 6. Finally, the conclusions of this work are given in Sec. 7. The paper is complemented with two appendices: Appendix A presents the direct linear transformation method for homography matrix estimation, and Appendix B explains a simple method to obtain the camera parameters from homographies.

## 2.1.

### Operators $\mathcal{H}$ and $\mathcal{S}$

A point in an $n$-dimensional space will be represented by a vector of the form

## Eq. (1)

$\mathbf{x}={\left[\begin{array}{cccc}{x}_{1}& {x}_{2}& \cdots & {x}_{n}\end{array}\right]}^{T},$
where ${\left[·\right]}^{T}$ denotes the transpose. The homogeneous coordinates of the point are obtained by appending to $\mathbf{x}$ an extra entry equal to unity. The result is the $\left(n+1\right)$-dimensional vector

## Eq. (2)

$\mathcal{H}\left[\mathbf{x}\right]=\left[\begin{array}{c}\mathbf{x}\\ 1\end{array}\right],$
where $\mathcal{H}$ will be referred to as the homogeneous operator.

The last entry of a homogeneous vector is known as the scale and will be recovered by the scale operator $\mathcal{S}$. This operator returns the last entry of any given vector. For instance, for the vectors in Eqs. (1) and (2), we have

## Eq. (3)

${x}_{n}=\mathcal{S}\left[\mathbf{x}\right],\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{and}\phantom{\rule[-0.0ex]{1em}{0.0ex}}1=\mathcal{S}\left[\mathcal{H}\left[\mathbf{x}\right]\right].$

The operator $\mathcal{H}$ sets the scale to unity. Another operator that sets the scale to zero is needed. For this, we define the operator

## Eq. (4)

${\mathcal{H}}_{0}\left[\mathbf{x}\right]=\left[\begin{array}{c}\mathbf{x}\\ 0\end{array}\right].$
Note that the operator ${\mathcal{H}}_{0}$ affects neither the direction nor the norm of $\mathbf{x}$. In projective geometry, the points represented by homogeneous coordinates of the form

## Eq. (5)

${\mathcal{H}}_{0}\left[\mathbf{x}\right],\phantom{\rule[-0.0ex]{2em}{0.0ex}}\mathbf{x}\ne {\mathbf{0}}_{n},$
are known as ideal points, where ${\mathbf{0}}_{n}={\left[0,\cdots ,0\right]}^{T}$ is the $n$-dimensional zero vector.

The operators $\mathcal{H}$ and ${\mathcal{H}}_{0}$ can be considered as two particular cases of a more general operator defined as

## Eq. (6)

${\mathcal{H}}_{s}\left[\mathbf{x}\right]=\left[\begin{array}{c}\mathbf{x}\\ s\end{array}\right],$
where $s$ is any scalar.

The procedure of adding an extra entry to a vector is reversed by returning the given vector without its last entry. For this, we define the inverse operator ${\mathcal{H}}_{0}^{-1}$ as follows. For any $\left(n+1\right)$-dimensional vector

## Eq. (7)

$\mathbf{y}={\left[\begin{array}{ccccc}{y}_{1}& {y}_{2}& \cdots & {y}_{n}& {y}_{n+1}\end{array}\right]}^{T},$
the operator ${\mathcal{H}}_{0}^{-1}$ is defined as

## Eq. (8)

${\mathcal{H}}_{0}^{-1}\left[\mathbf{y}\right]={\left[\begin{array}{cccc}{y}_{1}& {y}_{2}& \cdots & {y}_{n}\end{array}\right]}^{T}.$
Based on the operator ${\mathcal{H}}_{0}^{-1}$, the inverse of the operator ${\mathcal{H}}_{s}$ for $s\ne 0$ is defined as

## Eq. (9)

${\mathcal{H}}_{s}^{-1}\left[\mathbf{y}\right]=\frac{s}{\mathcal{S}\left[\mathbf{y}\right]}{\mathcal{H}}_{0}^{-1}\left[\mathbf{y}\right].$
In particular, the operator ${\mathcal{H}}_{1}^{-1}$ (written simply as ${\mathcal{H}}^{-1}$) will be referred to as the inverse homogeneous operator.

The inverse ${\mathcal{H}}_{0}^{-1}$ is a linear operator. That is, for any two scalars ${\lambda }_{1}$ and ${\lambda }_{2}$, we have

## Eq. (10)

${\mathcal{H}}_{0}^{-1}\left[{\lambda }_{1}{\mathbf{y}}_{1}+{\lambda }_{2}{\mathbf{y}}_{2}\right]={\lambda }_{1}{\mathcal{H}}_{0}^{-1}\left[{\mathbf{y}}_{1}\right]+{\lambda }_{2}{\mathcal{H}}_{0}^{-1}\left[{\mathbf{y}}_{2}\right].$
On the other hand, the operator ${\mathcal{H}}_{s}^{-1}$ is invariant to nonzero scalar multiplication of its argument. That is,

## Eq. (11)

${\mathcal{H}}_{s}^{-1}\left[\lambda \mathbf{y}\right]={\mathcal{H}}_{s}^{-1}\left[\mathbf{y}\right],\phantom{\rule[-0.0ex]{2em}{0.0ex}}s,\lambda \ne 0.$
The operators ${\mathcal{H}}_{s}$ and ${\mathcal{H}}_{s}^{-1}$ can be expressed in terms of the homogeneous operator and its inverse, namely

## Eq. (12)

${\mathcal{H}}_{s}\left[\mathbf{x}\right]={\mathrm{\Xi }}_{s}\mathcal{H}\left[\mathbf{x}\right],\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{and}\phantom{\rule{0ex}{0ex}}{\mathcal{H}}_{s}^{-1}\left[\mathbf{y}\right]={\mathcal{H}}^{-1}\left[{\mathrm{\Xi }}_{s}^{-1}\mathbf{y}\right],\phantom{\rule[-0.0ex]{1em}{0.0ex}}s\ne 0,$
where

## Eq. (13)

${\mathrm{\Xi }}_{s}=\left[\begin{array}{cc}{\mathbb{I}}_{n}& {\mathbf{0}}_{n}\\ {\mathbf{0}}_{n}^{T}& s\end{array}\right],$
where ${\mathbb{I}}_{n}$ is the $n×n$ identity matrix.
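The operators of this section translate almost directly into array code. The following is a minimal sketch, assuming NumPy; the function names `hom`, `scale`, `hom_inv`, and `xi` are illustrative choices for $\mathcal{H}_s$, $\mathcal{S}$, $\mathcal{H}_s^{-1}$, and $\Xi_s$, and are not part of the original presentation.

```python
import numpy as np

def hom(x, s=1.0):
    """H_s[x]: append the scale s to x, Eq. (6); s=1 gives H, s=0 gives H_0."""
    return np.append(np.asarray(x, dtype=float), s)

def scale(y):
    """S[y]: return the last entry of y."""
    return np.asarray(y, dtype=float)[-1]

def hom_inv(y, s=1.0):
    """H_s^{-1}[y], Eqs. (8) and (9); s=0 only drops the last entry."""
    y = np.asarray(y, dtype=float)
    if s == 0:
        return y[:-1]
    return (s / y[-1]) * y[:-1]

def xi(n, s):
    """Matrix Xi_s of Eq. (13)."""
    m = np.eye(n + 1)
    m[-1, -1] = s
    return m

x = np.array([2.0, -1.0, 4.0])
y = hom(x, s=3.0)                                   # [2, -1, 4, 3]
x_back = hom_inv(y, s=3.0)                          # recovers x
x_scaled = hom_inv(-5.0 * y, s=3.0)                 # Eq. (11): same result
y_via_xi = xi(3, 3.0) @ hom(x)                      # Eq. (12), first identity
x_via_xi = hom_inv(np.linalg.inv(xi(3, 3.0)) @ y)   # Eq. (12), second identity
```

The sketch does not guard against a zero scale in `hom_inv` with $s\ne 0$; in that case the argument is an ideal point and the division is undefined.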

## 2.2.

### Projection Operator $\mathcal{P}$

In general terms, the homogeneous operator carries the representation of a point from $n$- to $\left(n+1\right)$-dimensional vectors while the inverse homogeneous operator returns the representation from $\left(n+1\right)$- to $n$-dimensional vectors. An important transformation emerges when, in the $\left(n+1\right)$-dimensional space, a linear mapping is applied. Mathematically, we describe this transformation by the projection operator defined as

## Eq. (14)

${\mathcal{P}}_{M}\left[\mathbf{x}\right]={\mathcal{H}}^{-1}\left[M\mathcal{H}\left[\mathbf{x}\right]\right],$
where $M$ is an $\left(n+1\right)×\left(n+1\right)$ matrix. A generalized version of the projection operator is obtained using ${\mathcal{H}}_{s}$ and its inverse as

## Eq. (15)

${\mathcal{P}}_{M,s}\left[\mathbf{x}\right]={\mathcal{H}}_{s}^{-1}\left[M{\mathcal{H}}_{s}\left[\mathbf{x}\right]\right].$
From Eq. (11), it follows that, for $s\ne 0$, the operator ${\mathcal{P}}_{M,s}$ is invariant to nonzero scalar multiplication of the matrix $M$; that is

## Eq. (16)

${\mathcal{P}}_{\lambda M,s}\left[\mathbf{x}\right]={\mathcal{P}}_{M,s}\left[\mathbf{x}\right],\phantom{\rule[-0.0ex]{2em}{0.0ex}}s,\lambda \ne 0.$
Let $\mathbf{b}={\mathcal{P}}_{M,s}\left[\mathbf{a}\right]$ with $M$ being a nonsingular matrix. From Eq. (15), we have that $\mathbf{a}={\mathcal{H}}_{s}^{-1}\left[{M}^{-1}{\mathcal{H}}_{s}\left[\mathbf{b}\right]\right]$. Therefore, the inverse operator ${\mathcal{P}}_{M,s}^{-1}$ is given by

## Eq. (17)

${\mathcal{P}}_{M,s}^{-1}\left[\mathbf{x}\right]={\mathcal{P}}_{{M}^{-1},s}\left[\mathbf{x}\right],\phantom{\rule[-0.0ex]{1em}{0.0ex}}\mathrm{det}\text{\hspace{0.17em}}M\ne 0,$
where $\mathrm{det}\text{\hspace{0.17em}}M$ denotes the determinant of $M$.

Using the equations in Eq. (12), the operator ${\mathcal{P}}_{M,s}$ can be expressed in terms of the projection operator, namely

## Eq. (18)

${\mathcal{P}}_{M,s}\left[\mathbf{x}\right]={\mathcal{P}}_{{\mathrm{\Xi }}_{s}^{-1}M{\mathrm{\Xi }}_{s}}\left[\mathbf{x}\right],\phantom{\rule[-0.0ex]{2em}{0.0ex}}s\ne 0.$
Note that $M$ and ${\mathrm{\Xi }}_{s}^{-1}M{\mathrm{\Xi }}_{s}$ are similar matrices. Some useful equalities of the defined operators are summarized in Table 1. For a more comprehensible reading of this paper, the reader is encouraged to demonstrate all the equalities in Table 1.
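As an illustration of Eqs. (15) to (18), the projection operator can be sketched in a few lines. This assumes NumPy; the function name `proj` and the numerical values of $M$, $\mathbf{x}$, and $s$ are arbitrary choices made for the example.

```python
import numpy as np

def proj(M, x, s=1.0):
    """P_{M,s}[x] = H_s^{-1}[M H_s[x]], Eq. (15), for s != 0."""
    y = M @ np.append(np.asarray(x, dtype=float), s)
    return (s / y[-1]) * y[:-1]

M = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0],
              [0.5, 0.0, 1.0]])      # nonsingular example matrix
x = np.array([1.0, 2.0])
s = 4.0

b = proj(M, x, s)
same = proj(-7.0 * M, x, s)                       # Eq. (16): scaling M changes nothing
a = proj(np.linalg.inv(M), b, s)                  # Eq. (17): recovers x
Xi = np.diag([1.0, 1.0, s])                       # Eq. (13) for n = 2
b_sim = proj(np.linalg.inv(Xi) @ M @ Xi, x)       # Eq. (18), evaluated with s = 1
```

Here `b`, `same`, and `b_sim` coincide, and `a` returns the original point.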

## Table 1

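Some useful equalities of the operators $\mathcal{S}$, ${\mathcal{H}}_{s}$, and ${\mathcal{P}}_{M,s}$. In all cases, $\lambda \ne 0$, ${\gamma }_{1}$ and ${\gamma }_{2}$ are any scalars, $\mathbf{x}$ is an $n$-dimensional vector as given in Eq. (1), ${\mathrm{\Xi }}_{s}$ is the matrix defined in Eq. (13), $\mathbf{y}=\lambda {\mathcal{H}}_{s}\left[\mathbf{x}\right]$, $M$ is a matrix of size $\left(n+1\right)×\left(n+1\right)$, and $W$ is a matrix of size $m×\left(n+1\right)$.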

| Property | Equality | Condition |
| --- | --- | --- |
| (P1) | ${\mathcal{H}}_{s}^{-1}\left[{\mathcal{H}}_{s}\left[\mathbf{x}\right]\right]=\mathbf{x}$ | $s\ne 0$ |
| (P2) | ${\mathcal{H}}_{s}\left[{\mathcal{H}}_{s}^{-1}\left[\mathbf{y}\right]\right]=\frac{s}{\mathcal{S}\left[\mathbf{y}\right]}\mathbf{y}$ | $s\ne 0$ |
| (P3) | $\mathbf{x}={\mathcal{H}}_{s}^{-1}\left[\mathbf{y}\right]↔\frac{\mathcal{S}\left[\mathbf{y}\right]}{s}{\mathcal{H}}_{s}\left[\mathbf{x}\right]=\mathbf{y}$ | $s\ne 0$ |
| (P4) | ${\mathcal{H}}_{s}\left[\lambda \mathbf{x}\right]=\lambda {\mathcal{H}}_{s/\lambda }\left[\mathbf{x}\right]$ | |
| (P5) | $\lambda {\mathcal{H}}_{s}\left[\mathbf{x}\right]={\mathcal{H}}_{\lambda s}\left[\lambda \mathbf{x}\right]$ | |
| (P6) | ${\mathcal{H}}_{s}\left[W{\mathcal{H}}_{s}\left[\mathbf{x}\right]\right]=\left[\begin{array}{c}W\\ \mathcal{H}{\left[{\mathbf{0}}_{n}\right]}^{T}\end{array}\right]{\mathcal{H}}_{s}\left[\mathbf{x}\right]$ | |
| (P7) | ${\mathcal{H}}_{s}\left[{\mathbf{x}}_{1}±{\mathbf{x}}_{2}\right]={\mathcal{H}}_{{s}_{1}}\left[{\mathbf{x}}_{1}\right]±{\mathcal{H}}_{{s}_{2}}\left[{\mathbf{x}}_{2}\right]$ | ${s}_{1}±{s}_{2}=s$ |
| (P8) | ${\mathcal{H}}_{s}^{-1}\left[\lambda \mathbf{y}\right]={\mathcal{H}}_{s}^{-1}\left[\mathbf{y}\right]$ | $s\ne 0$ |
| (P9) | $\lambda {\mathcal{H}}_{s}^{-1}\left[\mathbf{y}\right]={\mathcal{H}}_{\lambda s}^{-1}\left[\mathbf{y}\right]$ | $s\ne 0$ |
| (P10) | ${\mathcal{H}}_{0}^{-1}\left[{\gamma }_{1}{\mathbf{y}}_{1}+{\gamma }_{2}{\mathbf{y}}_{2}\right]={\gamma }_{1}{\mathcal{H}}_{0}^{-1}\left[{\mathbf{y}}_{1}\right]+{\gamma }_{2}{\mathcal{H}}_{0}^{-1}\left[{\mathbf{y}}_{2}\right]$ | |
| (P11) | $\mathcal{S}\left[{\gamma }_{1}{\mathbf{x}}_{1}+{\gamma }_{2}{\mathbf{x}}_{2}\right]={\gamma }_{1}\mathcal{S}\left[{\mathbf{x}}_{1}\right]+{\gamma }_{2}\mathcal{S}\left[{\mathbf{x}}_{2}\right]$ | |
| (P12) | ${\mathcal{H}}_{s}^{-1}\left[{\mathbf{y}}_{1}+{\mathbf{y}}_{2}\right]=\frac{1}{\mathcal{S}\left[{\mathbf{y}}_{1}+{\mathbf{y}}_{2}\right]}\left(\mathcal{S}\left[{\mathbf{y}}_{1}\right]{\mathcal{H}}_{s}^{-1}\left[{\mathbf{y}}_{1}\right]+\mathcal{S}\left[{\mathbf{y}}_{2}\right]{\mathcal{H}}_{s}^{-1}\left[{\mathbf{y}}_{2}\right]\right)$ | $s\ne 0$ |
| (P13) | ${\mathcal{H}}_{s}^{-1}\left[{\mathbf{y}}_{1}+{\mathbf{y}}_{2}\right]-{\mathcal{H}}_{s}^{-1}\left[{\mathbf{y}}_{1}\right]=\frac{\mathcal{S}\left[{\mathbf{y}}_{2}\right]}{\mathcal{S}\left[{\mathbf{y}}_{1}+{\mathbf{y}}_{2}\right]}\left({\mathcal{H}}_{s}^{-1}\left[{\mathbf{y}}_{2}\right]-{\mathcal{H}}_{s}^{-1}\left[{\mathbf{y}}_{1}\right]\right)$ | $s\ne 0$ |
| (P14) | ${\mathcal{H}}_{s}\left[\mathbf{x}\right]={\mathrm{\Xi }}_{s}\mathcal{H}\left[\mathbf{x}\right]$ | |
| (P15) | ${\mathcal{H}}_{s}^{-1}\left[\mathbf{y}\right]={\mathcal{H}}^{-1}\left[{\mathrm{\Xi }}_{s}^{-1}\mathbf{y}\right]$ | $s\ne 0$ |
| (P16) | ${\mathcal{P}}_{\lambda M,s}\left[\mathbf{x}\right]={\mathcal{P}}_{M,s}\left[\mathbf{x}\right]$ | $s\ne 0$ |
| (P17) | ${\mathcal{P}}_{M,s}^{-1}\left[\mathbf{x}\right]={\mathcal{P}}_{{M}^{-1},s}\left[\mathbf{x}\right]$ | $\mathrm{det}\text{\hspace{0.17em}}M\ne 0$ |
| (P18) | ${\mathbf{x}}_{2}={\mathcal{P}}_{M,s}\left[{\mathbf{x}}_{1}\right]↔{\mathcal{P}}_{{M}^{-1},s}\left[{\mathbf{x}}_{2}\right]={\mathbf{x}}_{1}$ | $\mathrm{det}\text{\hspace{0.17em}}M\ne 0$ |
| (P19) | ${\mathcal{P}}_{M,s}\left[\lambda \mathbf{x}\right]=\lambda {\mathcal{P}}_{M,s/\lambda }\left[\mathbf{x}\right]$ | |
| (P20) | $\lambda {\mathcal{P}}_{M,s}\left[\mathbf{x}\right]={\mathcal{P}}_{M,\lambda s}\left[\lambda \mathbf{x}\right]$ | |
| (P21) | ${\mathcal{P}}_{{M}_{2},s}\left[{\mathcal{P}}_{{M}_{1},s}\left[\mathbf{x}\right]\right]={\mathcal{P}}_{{M}_{2}{M}_{1},s}\left[\mathbf{x}\right]$ | |
| (P22) | ${\mathcal{P}}_{{\mathbb{I}}_{n+1},s}\left[\mathbf{x}\right]=\mathbf{x}$ | |
| (P23) | ${\mathcal{P}}_{M,s}\left[\mathbf{x}\right]={\mathcal{P}}_{{\mathrm{\Xi }}_{s}^{-1}M{\mathrm{\Xi }}_{s}}\left[\mathbf{x}\right]$ | $s\ne 0$ |
| (P24) | ${\mathcal{H}}_{s}\left[{\mathcal{P}}_{M,s}\left[\mathbf{x}\right]\right]=\frac{s}{\mathcal{S}\left[M{\mathcal{H}}_{s}\left[\mathbf{x}\right]\right]}M{\mathcal{H}}_{s}\left[\mathbf{x}\right]$ | $s\ne 0$ |
| (P25) | ${\mathcal{P}}_{M,s}\left[{\mathcal{H}}_{s}^{-1}\left[\mathbf{y}\right]\right]={\mathcal{H}}_{s}^{-1}\left[M\mathbf{y}\right]$ | $s\ne 0$ |
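A few of the equalities in Table 1 can be spot-checked numerically before proving them. The sketch below assumes NumPy and uses illustrative helper names (`hom`, `hom_inv`, `proj`) and arbitrary test values; it evaluates both sides of (P12), (P21), and (P25). A numerical check of this kind is not a proof; it only helps to catch sign or index mistakes.

```python
import numpy as np

def hom(x, s=1.0):
    """H_s[x]: append the scale s."""
    return np.append(np.asarray(x, dtype=float), s)

def hom_inv(y, s=1.0):
    """H_s^{-1}[y]; s=0 only drops the last entry."""
    y = np.asarray(y, dtype=float)
    return y[:-1] if s == 0 else (s / y[-1]) * y[:-1]

def proj(M, x, s=1.0):
    """P_{M,s}[x] = H_s^{-1}[M H_s[x]]."""
    return hom_inv(M @ hom(x, s), s)

s = 2.0
y1 = np.array([1.0, 2.0, 3.0])
y2 = np.array([-4.0, 0.5, 2.0])
M1 = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 2.0]])
M2 = np.array([[2.0, 0.0, 1.0], [1.0, 3.0, 0.0], [0.5, 0.0, 1.0]])
x = np.array([1.0, -1.0])

# (P12): H_s^{-1} of a sum is a scale-weighted combination.
p12_lhs = hom_inv(y1 + y2, s)
p12_rhs = (y1[-1] * hom_inv(y1, s) + y2[-1] * hom_inv(y2, s)) / (y1 + y2)[-1]

# (P21): composing projections multiplies their matrices.
p21_lhs = proj(M2, proj(M1, x, s), s)
p21_rhs = proj(M2 @ M1, x, s)

# (P25): projecting H_s^{-1}[y] equals applying H_s^{-1} to M y.
p25_lhs = proj(M1, hom_inv(y1, s), s)
p25_rhs = hom_inv(M1 @ y1, s)
```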

In the following sections, the defined operators are studied from an intuitive geometrical approach for the 1-D and 2-D cases. Then, the usefulness of this theoretical framework is illustrated by addressing the perspective correction problem for camera document scanning.

## One-Dimensional Space

The 1-D real space can be represented as a line as shown in Fig. 1(a). In this space, a point at a finite distance from the origin is represented by a real number $x$; otherwise, the point is represented by the symbol $\infty$.

## Fig. 1

(a) The real line as the 1-D Euclidean space. (b) The 1-D space represented by the projective line ($y=1$) in a 2-D Euclidean space.

Alternatively, the 1-D space can be represented by the projective line $y=1$ in the $xy$-plane as shown in Fig. 1(b). Thus, the coordinate $x$ of a point in the line becomes the vector

## Eq. (19)

$\mathbf{y}=\mathcal{H}\left[x\right]=\left[\begin{array}{c}x\\ 1\end{array}\right].$
The coordinate $x$ can be recovered from its homogeneous version $\mathbf{y}$ as the intersection between the line $y=1$ and the line with direction $\mathcal{H}\left[x\right]$ passing through the origin as shown in Fig. 1(b). This is described mathematically as

## Eq. (20)

$x={\mathcal{H}}^{-1}\left[\mathbf{y}\right].$
Note that the result is invariant to the scalar multiplication of $\mathbf{y}$ by a nonzero scalar [e.g., $\lambda$ and $-\gamma$ as shown in Fig. 1(b)] because the intersection between lines is unaltered. In other words, ${\mathcal{H}}^{-1}\left[\lambda \mathbf{y}\right]={\mathcal{H}}^{-1}\left[-\gamma \mathbf{y}\right]={\mathcal{H}}^{-1}\left[\mathbf{y}\right]$ as stated by Eq. (11).

## 3.1.

### Ideal Point

Homogeneous coordinates provide a different way to identify points of the real line. Consider the unit vector

## Eq. (21)

$\mathbf{u}\left(\theta \right)=\left[\begin{array}{c}\mathrm{sin}\text{\hspace{0.17em}}\theta \\ \mathrm{cos}\text{\hspace{0.17em}}\theta \end{array}\right].$
Thus, the homogeneous representation of $x$ given by Eq. (19) becomes

## Eq. (22)

$\mathcal{H}\left[x\right]=\lambda \mathbf{u}\left(\theta \right),$
where ${\lambda }^{2}=1+{x}^{2}$ and $\mathrm{tan}\text{\hspace{0.17em}}\theta =x$. From Eq. (22), we obtain

## Eq. (23)

$x={\mathcal{H}}^{-1}\left[\mathbf{u}\left(\theta \right)\right].$
Since a vector $\mathbf{u}$ and its opposite $-\mathbf{u}$ represent the same point (i.e., ${\mathcal{H}}^{-1}\left[\mathbf{u}\right]={\mathcal{H}}^{-1}\left[-\mathbf{u}\right]$), each point of the real line at a finite distance from the origin is associated with a unique angle $\theta$ in the open interval $\left(-\pi /2,\pi /2\right)$; i.e., with the vectors $\mathbf{u}$ in quadrants I and II other than ${\left[1,0\right]}^{T}$ and ${\left[-1,0\right]}^{T}$, as shown in Fig. 2.

## Fig. 2

Representation of points of the real line using homogeneous coordinates. Opposite homogeneous vectors represent the same point; thus, there is a single point at infinity, given by $\mathbf{u}\left(\pi /2\right)={\left[1,0\right]}^{T}$.

Intuitively, the real line in Euclidean representation has two points at infinity, namely $-\infty$ and $+\infty$. However, in projective geometry, the real line has only a single point at infinity given by the homogeneous coordinates

## Eq. (24)

$\mathbf{\psi }=\left[\begin{array}{c}1\\ 0\end{array}\right],$
which is associated with $\mathbf{u}\left(\pi /2\right)$, as shown in Fig. 2. It could be argued that $\pi /2$ corresponds to $+\infty$ while $-\pi /2$ corresponds to $-\infty$. However, note that $\mathbf{u}\left(-\pi /2\right)=-\mathbf{\psi }$ is the opposite of $\mathbf{\psi }$. Hence, they represent the same point.

Note that ${\mathcal{H}}^{-1}\left[\mathbf{\psi }\right]=1/0$ is consistent with the notion that $\mathbf{\psi }$ represents a point at infinite distance from the origin. According to the concepts of projective geometry, the vector $\mathbf{\psi }$ represents an ideal point, see Eq. (5).
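The angular parameterization of Eqs. (21) to (23) can be illustrated with a short numerical sketch. It assumes NumPy; the names `u` and `hom_inv` and the sample value of $x$ are choices made only for this example.

```python
import numpy as np

def u(theta):
    """Unit vector of Eq. (21)."""
    return np.array([np.sin(theta), np.cos(theta)])

def hom_inv(y):
    """H^{-1}[y] for a 2-vector: first entry divided by the scale."""
    return y[0] / y[1]

x = 2.5
theta = np.arctan(x)            # tan(theta) = x
lam = np.hypot(1.0, x)          # lambda^2 = 1 + x^2
y = lam * u(theta)              # Eq. (22): equals H[x] = [x, 1]
x_pos = hom_inv(u(theta))       # Eq. (23)
x_neg = hom_inv(-u(theta))      # the opposite vector gives the same point
psi = u(np.pi / 2)              # scale tends to zero: the ideal point
```

In floating point, the scale of `psi` is a tiny nonzero number rather than exactly zero, so applying `hom_inv` to it returns a very large value instead of raising an error.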

## 3.2.

### One-Dimensional Projection

The line $y=0$ can be transformed to any other line by applying a rotation $Q=\left[{\mathbf{q}}_{1},{\mathbf{q}}_{2}\right]$ and a translation $\mathbf{s}$. Thus, a point in the line $y=0$, represented by the scalar $x$, becomes a point in the $xy$-plane given by the vector

## Eq. (25)

$\mathbf{p}=Q{\mathcal{H}}_{0}\left[x\right]+\mathbf{s}\phantom{\rule{0ex}{0ex}}={\mathrm{\Pi }}_{1}\mathcal{H}\left[x\right],$
where the matrix ${\mathrm{\Pi }}_{1}$ will be referred to as the reference line parameters and has the explicit form

## Eq. (26)

${\mathrm{\Pi }}_{1}=\left[\begin{array}{cc}{\mathbf{q}}_{1}& \mathbf{s}\end{array}\right].$
The first column of ${\mathrm{\Pi }}_{1}$ and the determinant $\mathrm{det}\text{\hspace{0.17em}}{\mathrm{\Pi }}_{1}$ provide the direction of the reference line and its distance from the origin, respectively.

If the matrix ${\mathrm{\Pi }}_{1}$ is singular, the vectors ${\mathbf{q}}_{1}$ and $\mathbf{s}$ are collinear. In this case, the origin is a point of the transformed line (the distance of the line from the origin is zero). The matrix ${\mathrm{\Pi }}_{1}$ is nonsingular when ${\mathbf{q}}_{1}$ and $\mathbf{s}$ are linearly independent. In this case, the origin is not a point of the transformed line.

Let $\mathbf{p}$ in Eq. (25) be the homogeneous coordinates of a point $\alpha$ in the line. Thus, we obtain the 1-D projection

## Eq. (27)

$\alpha ={\mathcal{H}}^{-1}\left[\mathbf{p}\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[{\mathrm{\Pi }}_{1}\mathcal{H}\left[x\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{P}}_{{\mathrm{\Pi }}_{1}}\left[x\right].$
The transformations by ${\mathcal{P}}_{{\mathrm{\Pi }}_{1}}\left[x\right]$ and its inverse ${\mathcal{P}}_{{\mathrm{\Pi }}_{1}^{-1}}\left[\alpha \right]$ are shown in Fig. 3.
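The 1-D projection can be sketched numerically as follows. The example assumes NumPy; the rotation angle, the translation, and the function name `proj` are hypothetical values and names chosen for illustration.

```python
import numpy as np

def proj(M, x):
    """P_M[x] = H^{-1}[M H[x]] for a scalar x and a 2x2 matrix M."""
    y = M @ np.array([x, 1.0])
    return y[0] / y[1]

phi = np.deg2rad(30.0)                          # hypothetical rotation angle
Q = np.array([[np.cos(phi), -np.sin(phi)],
              [np.sin(phi),  np.cos(phi)]])
s = np.array([0.5, 2.0])                        # hypothetical translation
Pi1 = np.column_stack([Q[:, 0], s])             # Eq. (26)

x = 1.5
p = Q @ np.array([x, 0.0]) + s                  # rotated and translated point
alpha = proj(Pi1, x)                            # Eq. (27)
x_back = proj(np.linalg.inv(Pi1), alpha)        # inverse projection recovers x
dist = abs(np.linalg.det(Pi1))                  # distance of the line from the origin
```

Since the first column of $\Pi_1$ is a unit vector, the absolute value of the determinant equals the length of the component of $\mathbf{s}$ orthogonal to the line direction.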

## 4.1.

### Points and Lines in the Plane

Any point in the 2-D space can be represented as the vector

## Eq. (28)

$\mathbf{x}={\left[\begin{array}{cc}{x}_{1}& {x}_{2}\end{array}\right]}^{T}.$
Moreover, the point $\mathbf{x}$ can be represented by its homogeneous coordinates

## Eq. (29)

$\mathcal{H}\left[\mathbf{x}\right]=\left[\begin{array}{c}\mathbf{x}\\ 1\end{array}\right],$
as shown in Fig. 4(a). Note that $\mathcal{H}$ takes the 2-D vector $\mathbf{x}$ (in the plane $z=0$) and converts it to the 3-D vector $\mathcal{H}\left[\mathbf{x}\right]$, where $\mathbf{x}$ is unaltered but now lies in the projective plane $z=1$. It is worth mentioning that the vector $\mathbf{x}$ can be recovered from $\lambda \mathcal{H}\left[\mathbf{x}\right]$ as the point where the line through ${\mathbf{0}}_{3}$ and $\lambda \mathcal{H}\left[\mathbf{x}\right]$ intersects the projective plane $z=1$, see Eq. (11). That is,

## Eq. (30)

$\mathbf{x}={\mathcal{H}}^{-1}\left[\lambda \mathcal{H}\left[\mathbf{x}\right]\right],\phantom{\rule[-0.0ex]{2em}{0.0ex}}\lambda \ne 0.$
A line in the $xy$-plane can be written as the homogeneous equation

## Eq. (31)

${l}_{1}{x}_{1}+{l}_{2}{x}_{2}+{l}_{3}=0,$
where ${l}_{1}$, ${l}_{2}$, and ${l}_{3}$ are coefficients. Using homogeneous coordinates, Eq. (31) becomes

## Eq. (32)

${\mathbf{l}}^{T}\mathcal{H}\left[\mathbf{x}\right]=0,$
where $\mathbf{x}={\left[{x}_{1},{x}_{2}\right]}^{T}$ is a point of the line and $\mathbf{l}={\left[{l}_{1},{l}_{2},{l}_{3}\right]}^{T}$ is the vector that defines the line. Equation (32) shows that $\mathbf{l}$ and $\mathcal{H}\left[\mathbf{x}\right]$ are orthogonal vectors. Note that the vector $\mathbf{l}$ is unique up to scale, i.e., the vectors $\mathbf{l}$ and $\lambda \mathbf{l}$, with $\lambda \ne 0$, represent the same line.

## Fig. 4

(a) The 2-D space represented by the projective plane ($z=1$). (b) Parallel lines in the plane.

Let ${\mathbf{x}}_{1}$ and ${\mathbf{x}}_{2}$ be two different points in the $xy$-plane. The vector $\mathbf{l}$ of the line passing through ${\mathbf{x}}_{1}$ and ${\mathbf{x}}_{2}$ can be obtained by the cross product as

## Eq. (33)

$\mathbf{l}=\mathcal{H}\left[{\mathbf{x}}_{1}\right]×\mathcal{H}\left[{\mathbf{x}}_{2}\right].$
By definition of the cross product, the vector $\mathbf{l}$ is orthogonal to $\mathcal{H}\left[{\mathbf{x}}_{1}\right]$ and $\mathcal{H}\left[{\mathbf{x}}_{2}\right]$. Therefore, these vectors satisfy Eq. (32).

Consider two lines defined by the vectors ${\mathbf{l}}_{1}$ and ${\mathbf{l}}_{2}$. If $\mathbf{x}$ is the intersection point of these lines, then $\mathcal{H}\left[\mathbf{x}\right]$ is orthogonal to ${\mathbf{l}}_{1}$ and ${\mathbf{l}}_{2}$. That is

## Eq. (34)

$\lambda \mathcal{H}\left[\mathbf{x}\right]={\mathbf{l}}_{1}×{\mathbf{l}}_{2},$
where $\lambda =\mathcal{S}\left[{\mathbf{l}}_{1}×{\mathbf{l}}_{2}\right]$. Therefore, the intersection point of the lines ${\mathbf{l}}_{1}$ and ${\mathbf{l}}_{2}$ is

## Eq. (35)

$\mathbf{x}={\mathcal{H}}^{-1}\left[{\mathbf{l}}_{1}×{\mathbf{l}}_{2}\right].$
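Equations (33) and (35) reduce line construction and line intersection to cross products. A minimal sketch, assuming NumPy and using the illustrative names `line_through` and `intersection`:

```python
import numpy as np

def hom(x):
    """H[x]: append a unit scale."""
    return np.append(np.asarray(x, dtype=float), 1.0)

def hom_inv(y):
    """H^{-1}[y]: divide by the scale and drop it."""
    return y[:-1] / y[-1]

def line_through(x1, x2):
    """Eq. (33): vector l of the line through two points."""
    return np.cross(hom(x1), hom(x2))

def intersection(l1, l2):
    """Eq. (35): intersection point of two non-parallel lines."""
    return hom_inv(np.cross(l1, l2))

l1 = line_through([0.0, 0.0], [2.0, 2.0])   # the line x2 = x1
l2 = line_through([0.0, 2.0], [2.0, 0.0])   # the line x1 + x2 = 2
x = intersection(l1, l2)                     # the point (1, 1)
```

For parallel lines the scale of the cross product is zero, so `intersection` would divide by zero; that case corresponds to an ideal point.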

## 4.2.

### Parallel Lines

Two different lines are parallel if their defining vectors are of the form

## Eq. (36)

$\mathbf{l}=\left[\begin{array}{c}{l}_{1}\\ {l}_{2}\\ {l}_{3}\end{array}\right],\phantom{\rule[-0.0ex]{2em}{0.0ex}}\overline{\mathbf{l}}=\lambda \left[\begin{array}{c}{l}_{1}\\ {l}_{2}\\ {l}_{3}+\delta \end{array}\right],$
where $\lambda$, $\delta \ne 0$. This can be verified as follows. Consider two parallel lines in the plane with points $\mathbf{\alpha }$ and $\mathbf{\beta }$ given, respectively, by

## Eq. (37)

$\mathbf{\alpha }=\mathbf{a}+\gamma \mathbf{d},\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{and}\phantom{\rule{0ex}{0ex}}\mathbf{\beta }=\mathbf{a}+\gamma \mathbf{d}+\delta \mathbf{t},$
where $\gamma$ is a parameter, $\delta \ne 0$ is a constant, $\mathbf{a}$ is a reference point, $\mathbf{d}$ is a unit vector with direction of the line, and $\mathbf{t}$ is a unit vector orthogonal to $\mathbf{d}$, i.e.,

## Eq. (38)

${\mathcal{H}}_{0}\left[\mathbf{t}\right]×{\mathcal{H}}_{0}\left[\mathbf{d}\right]=\mathcal{H}\left[{\mathbf{0}}_{2}\right],$
as shown in Fig. 4(b). Two points of each line are
${\mathbf{\alpha }}_{1}=\mathbf{a}+{\gamma }_{1}\mathbf{d},\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{\beta }}_{1}=\mathbf{a}+{\overline{\gamma }}_{1}\mathbf{d}+\delta \mathbf{t},\phantom{\rule{0ex}{0ex}}{\mathbf{\alpha }}_{2}=\mathbf{a}+{\gamma }_{2}\mathbf{d},\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{\beta }}_{2}=\mathbf{a}+{\overline{\gamma }}_{2}\mathbf{d}+\delta \mathbf{t}.$
Thus, the vector of the line with points $\mathbf{\alpha }$ is

## Eq. (39)

$\mathbf{l}=\mathcal{H}\left[{\mathbf{\alpha }}_{1}\right]×\mathcal{H}\left[{\mathbf{\alpha }}_{2}\right]\phantom{\rule{0ex}{0ex}}=\left({\gamma }_{2}-{\gamma }_{1}\right)\mathcal{H}\left[\mathbf{a}\right]×{\mathcal{H}}_{0}\left[\mathbf{d}\right],$
or, since the line is unaffected by scaling of its vector

## Eq. (40)

$\mathbf{l}=\mathcal{H}\left[\mathbf{a}\right]×{\mathcal{H}}_{0}\left[\mathbf{d}\right].$
Similarly, the vector of the line with points $\mathbf{\beta }$ is

## Eq. (41)

$\overline{\mathbf{l}}=\mathcal{H}\left[{\mathbf{\beta }}_{1}\right]×\mathcal{H}\left[{\mathbf{\beta }}_{2}\right]\phantom{\rule{0ex}{0ex}}=\left({\overline{\gamma }}_{2}-{\overline{\gamma }}_{1}\right)\left(\mathbf{l}+\delta \mathcal{H}\left[{\mathbf{0}}_{2}\right]\right)\phantom{\rule{0ex}{0ex}}=\lambda \left(\mathbf{l}+\delta \mathcal{H}\left[{\mathbf{0}}_{2}\right]\right),$
where $\lambda ={\overline{\gamma }}_{2}-{\overline{\gamma }}_{1}$. Therefore, the vectors $\mathbf{l}$ and $\overline{\mathbf{l}}$ given in Eq. (36) represent parallel lines.

It is worth mentioning that, if $\mathbf{l}$ is the vector of a line with direction $\mathbf{d}$ [see Eq. (40)], then the vector ${\mathcal{H}}_{0}^{-1}\left[\mathbf{l}\right]$ is orthogonal to $\mathbf{d}$, namely

## Eq. (42)

${\mathbf{d}}^{T}{\mathcal{H}}_{0}^{-1}\left[\mathbf{l}\right]={\mathbf{d}}^{T}{\mathcal{H}}_{0}^{-1}\left[\mathcal{H}\left[\mathbf{a}\right]×{\mathcal{H}}_{0}\left[\mathbf{d}\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}_{0}{\left[\mathbf{d}\right]}^{T}\left(\mathcal{H}\left[\mathbf{a}\right]×{\mathcal{H}}_{0}\left[\mathbf{d}\right]\right)\phantom{\rule{0ex}{0ex}}=0.$
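The derivation of Eqs. (38) to (42) can be checked on a concrete pair of parallel lines. The sketch assumes NumPy; the reference point, direction, offset, and line parameters are arbitrary illustrative values, and `t` is chosen with the orientation that satisfies Eq. (38).

```python
import numpy as np

def hom(x, s=1.0):
    """H_s[x]: append the scale s (s=0 gives H_0)."""
    return np.append(np.asarray(x, dtype=float), s)

a = np.array([1.0, 2.0])                 # reference point
d = np.array([3.0, 4.0]) / 5.0           # unit direction of the lines
t = np.array([4.0, -3.0]) / 5.0          # unit vector orthogonal to d
delta = 2.0                              # offset between the lines

cross_td = np.cross(hom(t, 0.0), hom(d, 0.0))     # Eq. (38): gives [0, 0, 1]
l = np.cross(hom(a), hom(d, 0.0))                  # Eq. (40)

g1, g2 = -1.0, 2.5                                 # two parameters on the shifted line
beta1 = a + g1 * d + delta * t
beta2 = a + g2 * d + delta * t
l_bar = np.cross(hom(beta1), hom(beta2))           # Eq. (41), from two points
l_bar_pred = (g2 - g1) * (l + delta * hom([0.0, 0.0]))  # Eq. (41), closed form

normal_dot = d @ l[:2]                             # Eq. (42): vanishes
```

The two lines differ only in their third entry (up to the common factor $\lambda$), which is the form stated in Eq. (36).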

## 4.3.

### Ideal Points and the Line at Infinity

In Euclidean geometry, two parallel lines in the plane do not intersect. However, in projective geometry, two different lines always intersect at a point. Consider the parallel lines given by the vectors in Eq. (36). Using Eq. (35), the intersection point is

## Eq. (43)

${\mathcal{H}}^{-1}\left[\mathbf{l}×\overline{\mathbf{l}}\right]={\mathcal{H}}^{-1}\left[\mathcal{H}\left[{\mathbf{0}}_{2}\right]×\mathbf{l}\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[\mathbf{\psi }\right],$
where

## Eq. (44)

$\mathbf{\psi }={\left[\begin{array}{ccc}-{l}_{2}& {l}_{1}& 0\end{array}\right]}^{T}$
is the point of intersection in homogeneous coordinates. Note that ${\mathcal{H}}^{-1}\left[\mathbf{\psi }\right]=\left[-{l}_{2}/0,{l}_{1}/0\right]$ provides the insight that parallel lines intersect at a point at infinity. As in the 1-D case, the vector $\mathbf{\psi }$ represents an ideal point, see Eq. (5).

The vector $\mathbf{\psi }$ is associated with the direction $\mathbf{d}$ of the line $\mathbf{l}$. To verify this, note that ${\mathcal{H}}_{0}^{-1}\left[\mathbf{l}\right]$ is orthogonal to $\mathbf{d}$ [Eq. (42)] as well as to ${\mathcal{H}}_{0}^{-1}\left[\mathbf{\psi }\right]$ (${\mathcal{H}}_{0}^{-1}{\left[\mathbf{\psi }\right]}^{T}{\mathcal{H}}_{0}^{-1}\left[\mathbf{l}\right]=0$). Since two vectors in the plane that are orthogonal to the same nonzero vector are parallel, we have

## Eq. (45)

${\mathcal{H}}_{0}^{-1}\left[\mathbf{\psi }\right]=\lambda \mathbf{d},$
where $\lambda$ is some nonzero scalar.

All ideal points given by Eq. (44) are collinear. The vector of such a line, known as the line at infinity, is

## Eq. (46)

${\mathbf{l}}_{\infty }=\mathcal{H}\left[{\mathbf{0}}_{2}\right]={\left[\begin{array}{ccc}0& 0& 1\end{array}\right]}^{T}.$
This can be easily verified by ${\mathbf{l}}_{\infty }^{T}\mathbf{\psi }=0$ as required by Eq. (32).

The ideal point $\mathbf{\psi }$ in Eq. (44) was obtained as the intersection of two parallel lines $\mathbf{l}$ and $\overline{\mathbf{l}}$. However, intuition suggests that the same result could be obtained by computing the intersection of the line $\mathbf{l}$ and the line at infinity ${\mathbf{l}}_{\infty }$. In fact, we have that

## Eq. (47)

$\mathbf{\psi }=\mathbf{l}×{\mathbf{l}}_{\infty }.$
Thus, using Eq. (45), the direction $\mathbf{d}$ of any line $\mathbf{l}$ is given by

## Eq. (48)

$\lambda \mathbf{d}={\mathcal{H}}_{0}^{-1}\left[\mathbf{l}×{\mathbf{l}}_{\infty }\right],$
where $\lambda$ is a nonzero scale factor. For this reason, the line ${\mathbf{l}}_{\infty }$ is interpreted as the set of directions of lines in the plane.
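The role of the line at infinity can be illustrated numerically. The sketch below assumes NumPy and uses arbitrary example lines; it intersects two parallel lines, intersects one of them with ${\mathbf{l}}_{\infty }$, and extracts the direction as in Eq. (48), with the free scale $\lambda$ fixed by normalizing to unit length.

```python
import numpy as np

l = np.array([-0.8, 0.6, -0.4])                  # a line with direction [0.6, 0.8]
l_bar = 3.0 * (l + np.array([0.0, 0.0, 2.0]))    # a parallel line, Eq. (36)
l_inf = np.array([0.0, 0.0, 1.0])                # Eq. (46)

psi = np.cross(l, l_bar)                          # intersection of the parallel lines
psi_inf = np.cross(l, l_inf)                      # Eq. (47)
d = psi_inf[:2] / np.linalg.norm(psi_inf[:2])     # Eq. (48), normalized direction
```

Both `psi` and `psi_inf` have zero scale and are proportional to each other, so they represent the same ideal point; the sign of the recovered direction is arbitrary because $\lambda$ may be negative.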

Similar to the 1-D case, homogeneous coordinates provide a different way to identify points of the plane. Consider the unit vector

## Eq. (49)

$\mathbf{v}={\left[\begin{array}{ccc}\mathrm{sin}\text{\hspace{0.17em}}\theta \text{\hspace{0.17em}}\mathrm{cos}\text{\hspace{0.17em}}\varphi & \mathrm{sin}\text{\hspace{0.17em}}\theta \text{\hspace{0.17em}}\mathrm{sin}\text{\hspace{0.17em}}\varphi & \mathrm{cos}\text{\hspace{0.17em}}\theta \end{array}\right]}^{T},$
where $\theta$ and $\varphi$ are polar and azimuth angles, respectively. Thus, the homogeneous coordinates for each point $\mathbf{x}={\left[{x}_{1},{x}_{2}\right]}^{T}$ on the plane are given by

## Eq. (50)

$\mathcal{H}\left[\mathbf{x}\right]=\lambda \mathbf{v},$
where ${\lambda }^{2}=1+{x}_{1}^{2}+{x}_{2}^{2}$. From Eq. (50), the following relation holds

## Eq. (51)

$\mathbf{x}={\mathcal{H}}^{-1}\left[\mathbf{v}\left(\theta ,\varphi \right)\right].$

The points of the plane at a finite distance from the origin are given by $\mathbf{v}\left(\theta ,\varphi \right)$ with $\theta \in \left[0,\pi /2\right)$ and $\varphi \in \left[-\pi ,\pi \right)$, i.e., the upper hemisphere of the unit sphere, see Fig. 5. The points of the plane at infinite distance from the origin are parameterized by $\theta =\pi /2$ and $\varphi \in \left(-\pi /2,\pi /2\right]$. These points have the homogeneous coordinates

## Eq. (52)

${\mathbf{v}}_{\infty }={\left[\begin{array}{ccc}\mathrm{cos}\text{\hspace{0.17em}}\varphi & \mathrm{sin}\text{\hspace{0.17em}}\varphi & 0\end{array}\right]}^{T},$
see Eq. (44). That is, the ideal points are represented by the half equator of the unit sphere, see yellow line in Fig. 5.
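The spherical parameterization of Eqs. (49) to (52) can be sketched as follows, assuming NumPy. The way the angles are computed from $\mathbf{x}$ (via `arccos` and `arctan2`) is an assumption made for this example; it follows from Eq. (50) but is not spelled out in the text.

```python
import numpy as np

def v(theta, phi):
    """Unit vector of Eq. (49)."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

x = np.array([3.0, -4.0])
lam = np.sqrt(1.0 + x @ x)            # lambda^2 = 1 + x1^2 + x2^2
theta = np.arccos(1.0 / lam)          # polar angle, in [0, pi/2)
phi = np.arctan2(x[1], x[0])          # azimuth angle
y = lam * v(theta, phi)               # Eq. (50): equals H[x]
vec = v(theta, phi)
x_back = vec[:2] / vec[2]             # Eq. (51)
v_inf = v(np.pi / 2, phi)             # Eq. (52): scale is numerically zero
```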

## Fig. 5

Representation of points of the plane using homogeneous coordinates $\mathbf{v}$. The upper hemisphere represents points of the plane at a finite distance from the origin, and the half of the equator (yellow semicircle) represents points at infinity.

## 4.4.

### Two-Dimensional Projection

Any plane in the 3-D space can be obtained as the plane $z=0$ after a rotation $Q=\left[{\mathbf{q}}_{1},{\mathbf{q}}_{2},{\mathbf{q}}_{3}\right]$ and a translation $\mathbf{s}$. Thus, a point represented by $\mathbf{x}={\left[{x}_{1},{x}_{2}\right]}^{T}$ becomes

## Eq. (53)

$\mathbf{p}=Q{\mathcal{H}}_{0}\left[\mathbf{x}\right]+\mathbf{s}\phantom{\rule{0ex}{0ex}}={\mathrm{\Pi }}_{2}\mathcal{H}\left[\mathbf{x}\right],$
where the matrix ${\mathrm{\Pi }}_{2}$ will be referred to as the reference plane parameters and has the explicit form

## Eq. (54)

${\mathrm{\Pi }}_{2}=\left[\begin{array}{ccc}{\mathbf{q}}_{1}& {\mathbf{q}}_{2}& \mathbf{s}\end{array}\right].$
The cross product of the first two columns of ${\mathrm{\Pi }}_{2}$ and the determinant $\mathrm{det}\text{\hspace{0.17em}}{\mathrm{\Pi }}_{2}$ provide the normal to the reference plane and its distance from the origin, respectively.

The matrix ${\mathrm{\Pi }}_{2}$ is singular when ${\mathbf{q}}_{1}$, ${\mathbf{q}}_{2}$, and $\mathbf{s}$ are coplanar. In this case, the origin ${\mathbf{0}}_{3}$ is a point of the transformed plane (the distance of the reference plane from the origin is zero). Otherwise, ${\mathrm{\Pi }}_{2}$ is nonsingular.

Let $\mathbf{p}$ in Eq. (53) be the homogeneous coordinates of a point $\mathbf{\alpha }$ in the projective plane. Thus, the relation between the points $\mathbf{\alpha }$ and $\mathbf{x}$ is given by the 2-D projection ${\mathcal{P}}_{{\mathrm{\Pi }}_{2}}$, namely

## Eq. (55)

$\mathbf{\alpha }={\mathcal{H}}^{-1}\left[\mathbf{p}\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[{\mathrm{\Pi }}_{2}\mathcal{H}\left[\mathbf{x}\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{P}}_{{\mathrm{\Pi }}_{2}}\left[\mathbf{x}\right].$
The projection ${\mathcal{P}}_{{\mathrm{\Pi }}_{2}}\left[\mathbf{x}\right]$ and its inverse ${\mathcal{P}}_{{\mathrm{\Pi }}_{2}^{-1}}\left[\mathbf{\alpha }\right]$ are shown in Fig. 6.
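
As an illustration of Eqs. (53) to (55), the following minimal sketch implements the operators numerically. It assumes the usual conventions $\mathcal{H}\left[\mathbf{x}\right]={\left[{\mathbf{x}}^{T},1\right]}^{T}$ and ${\mathcal{H}}^{-1}$ as division by the last entry; the function names `H`, `H_inv`, and `project`, as well as the plane parameters, are hypothetical choices made only for this example.

```python
import numpy as np

def H(x):
    """Homogeneous operator: append a unit entry to x."""
    return np.append(np.asarray(x, dtype=float), 1.0)

def H_inv(v):
    """Inverse operator: divide by the last entry, then drop it."""
    v = np.asarray(v, dtype=float)
    return v[:-1] / v[-1]

def project(M, x):
    """Projection P_M[x] = H^{-1}[M H[x]] of Eq. (55)."""
    return H_inv(M @ H(x))

# Hypothetical reference plane: rotation by 30 degrees about the x-axis
# followed by a translation s (values chosen only for illustration).
a = np.deg2rad(30.0)
Q = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(a), -np.sin(a)],
              [0.0, np.sin(a), np.cos(a)]])
s = np.array([0.1, -0.2, 3.0])
Pi2 = np.column_stack([Q[:, 0], Q[:, 1], s])   # Eq. (54)

x = np.array([0.5, -0.25])
p = Pi2 @ H(x)                                  # Eq. (53)
alpha = project(Pi2, x)                         # Eq. (55)
x_back = project(np.linalg.inv(Pi2), alpha)     # inverse projection
```

In this sketch, `x_back` should reproduce `x` because ${\mathrm{\Pi }}_{2}$ is nonsingular for the chosen values.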

## 4.5.

### Properties of the Two-Dimensional Projection

As shown in Fig. 6, the 2-D projection ${\mathcal{P}}_{{\mathrm{\Pi }}_{2}}$ does not preserve several geometrical properties, e.g., shape, angles, lengths, and ratios of lengths. Fortunately, some geometrical properties are preserved. In particular, we are interested in three of them that are very useful in practice, namely straightness, line–line intersection, and parallelism of the normal and line at infinity vectors.

## 4.5.1.

#### Straightness property

This property states that a 2-D projection transforms lines to lines.12 This can be shown as follows. Consider a line with vector $\mathbf{l}$ and points $\mathbf{x}$, that is

## Eq. (56)

$0={\mathbf{l}}^{T}\mathcal{H}\left[\mathbf{x}\right].$
Next, the points $\mathbf{x}$ are transformed to $\mathbf{\alpha }$ by Eq. (55). Solving Eq. (55) for $\mathbf{x}$ and substituting in Eq. (56), we obtain

## Eq. (57)

$0={\mathbf{l}}^{T}\mathcal{H}\left[{\mathcal{P}}_{{\mathrm{\Pi }}_{2}^{-1}}\left[\mathbf{\alpha }\right]\right]=\frac{{\mathbf{l}}^{T}{\mathrm{\Pi }}_{2}^{-1}\mathcal{H}\left[\mathbf{\alpha }\right]}{\mathcal{S}\left[{\mathrm{\Pi }}_{2}^{-1}\mathcal{H}\left[\mathbf{\alpha }\right]\right]},$
or

## Eq. (58)

$0={\mathbf{m}}^{T}\mathcal{H}\left[\mathbf{\alpha }\right],$
where

## Eq. (59)

$\mathbf{m}={\mathrm{\Pi }}_{2}^{-T}\mathbf{l},$
with ${\mathrm{\Pi }}_{2}^{-T}$ being the abbreviation of ${\left({\mathrm{\Pi }}_{2}^{-1}\right)}^{T}$ or ${\left({\mathrm{\Pi }}_{2}^{T}\right)}^{-1}$. In summary, the points $\mathbf{x}$ of a line $\mathbf{l}$ are transformed by ${\mathcal{P}}_{{\mathrm{\Pi }}_{2}}$ to points $\mathbf{\alpha }$ of a new line $\mathbf{m}$.
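
The straightness property can be checked numerically. The sketch below samples points of a line $\mathbf{l}$, maps them through the projection, and evaluates the residual ${\mathbf{m}}^{T}\mathcal{H}\left[\mathbf{\alpha }\right]$ with $\mathbf{m}$ from Eq. (59). The matrix `Pi2` and the line `l` are hypothetical values, and `H`, `H_inv` are assumed implementations of the operators.

```python
import numpy as np

def H(x):
    return np.append(np.asarray(x, dtype=float), 1.0)

def H_inv(v):
    v = np.asarray(v, dtype=float)
    return v[:-1] / v[-1]

# Assumed nonsingular plane matrix [q1, q2, s] (illustrative values).
Pi2 = np.array([[1.0, 0.0, 0.1],
                [0.0, 0.8, -0.2],
                [0.0, 0.6, 3.0]])
l = np.array([1.0, 2.0, -1.0])        # line x1 + 2*x2 - 1 = 0
m = np.linalg.inv(Pi2).T @ l          # Eq. (59)

# Sample points on l, project them, and evaluate Eq. (58).
residuals = []
for x1 in np.linspace(-2.0, 2.0, 5):
    x = np.array([x1, (1.0 - x1) / 2.0])
    alpha = H_inv(Pi2 @ H(x))         # Eq. (55)
    residuals.append(m @ H(alpha))
residuals = np.array(residuals)
```

All entries of `residuals` are expected to vanish up to floating-point error.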

## 4.5.2.

#### Line–line intersection

Preservation of the line–line intersection by a 2-D projection refers to the following. If

## Eq. (60)

${\mathbf{x}}_{0}={\mathcal{H}}^{-1}\left[{\mathbf{l}}_{1}×{\mathbf{l}}_{2}\right]$
is the point where the lines ${\mathbf{l}}_{1}$ and ${\mathbf{l}}_{2}$ intersect, then

## Eq. (61)

${\mathbf{\alpha }}_{0}={\mathcal{P}}_{{\mathrm{\Pi }}_{2}}\left[{\mathbf{x}}_{0}\right]$
is the intersection point of the lines

## Eq. (62)

${\mathbf{m}}_{1}={\mathrm{\Pi }}_{2}^{-T}{\mathbf{l}}_{1}\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{and}\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{m}}_{2}={\mathrm{\Pi }}_{2}^{-T}{\mathbf{l}}_{2}.$
In fact, the lines ${\mathbf{m}}_{1}$ and ${\mathbf{m}}_{2}$ intersect at the point

## Eq. (63)

${\mathbf{\alpha }}_{0}={\mathcal{H}}^{-1}\left[{\mathbf{m}}_{1}×{\mathbf{m}}_{2}\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[\left({\mathrm{\Pi }}_{2}^{-T}{\mathbf{l}}_{1}\right)×\left({\mathrm{\Pi }}_{2}^{-T}{\mathbf{l}}_{2}\right)\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[{\mathrm{\Pi }}_{2}\left({\mathbf{l}}_{1}×{\mathbf{l}}_{2}\right)\right],$
where the identity of the cross product

## Eq. (64)

$\left(M\mathbf{u}\right)×\left(M\mathbf{v}\right)=\left(\mathrm{det}\text{\hspace{0.17em}}M\right){M}^{-T}\left(\mathbf{u}×\mathbf{v}\right)$
was applied. By solving Eq. (60) for ${\mathbf{l}}_{1}×{\mathbf{l}}_{2}$ and substituting in Eq. (63), we obtain

## Eq. (65)

${\mathbf{\alpha }}_{0}={\mathcal{H}}^{-1}\left[\mathcal{S}\left[{\mathbf{l}}_{1}×{\mathbf{l}}_{2}\right]{\mathrm{\Pi }}_{2}\mathcal{H}\left[{\mathbf{x}}_{0}\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[{\mathrm{\Pi }}_{2}\mathcal{H}\left[{\mathbf{x}}_{0}\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{P}}_{{\mathrm{\Pi }}_{2}}\left[{\mathbf{x}}_{0}\right].$
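
A short numerical check of Eqs. (60) to (65) is sketched below. The intersection point is computed in two ways: by projecting ${\mathbf{x}}_{0}$ directly, and by intersecting the transformed lines. The matrix `Pi2` and the lines `l1`, `l2` are hypothetical values chosen for illustration.

```python
import numpy as np

def H(x):
    return np.append(np.asarray(x, dtype=float), 1.0)

def H_inv(v):
    v = np.asarray(v, dtype=float)
    return v[:-1] / v[-1]

# Assumed nonsingular plane matrix (illustrative values).
Pi2 = np.array([[1.0, 0.0, 0.1],
                [0.0, 0.8, -0.2],
                [0.0, 0.6, 3.0]])
l1 = np.array([1.0, 2.0, -1.0])
l2 = np.array([3.0, -1.0, 0.5])

x0 = H_inv(np.cross(l1, l2))               # Eq. (60)
Pi2_invT = np.linalg.inv(Pi2).T
m1, m2 = Pi2_invT @ l1, Pi2_invT @ l2      # Eq. (62)
alpha0_from_lines = H_inv(np.cross(m1, m2))    # Eq. (63)
alpha0_from_point = H_inv(Pi2 @ H(x0))         # Eq. (61)
```

For these values, ${\mathbf{x}}_{0}={\left[0,0.5\right]}^{T}$, and both estimates of ${\mathbf{\alpha }}_{0}$ should coincide.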

## 4.5.3.

#### Parallelism of the normal and line at infinity vectors

The normal of the $xy$-plane and the vector ${\mathbf{l}}_{\infty }$ of the line at infinity are parallel. When the projection ${\mathcal{P}}_{{\mathrm{\Pi }}_{2}}$ is applied, the normal ${\mathbf{q}}_{3}$ of the reference plane (with parameters ${\mathrm{\Pi }}_{2}$) and the new line at infinity ${\mathbf{m}}_{\infty }$ still remain parallel; i.e., ${\mathbf{m}}_{\infty }=\lambda {\mathbf{q}}_{3}$, $\lambda \ne 0$. Actually, the reference plane has the normal

## Eq. (66)

${\mathbf{q}}_{3}={\mathbf{q}}_{1}×{\mathbf{q}}_{2},$
see Eq. (54), whereas the vector of the new line at infinity is

## Eq. (67)

${\mathbf{m}}_{\infty }={\mathrm{\Pi }}_{2}^{-T}{\mathbf{l}}_{\infty }\phantom{\rule{0ex}{0ex}}=\lambda \left(\mathrm{cof}\text{\hspace{0.17em}}{\mathrm{\Pi }}_{2}\right){\mathbf{l}}_{\infty }\phantom{\rule{0ex}{0ex}}=\lambda \left[\begin{array}{ccc}{\mathbf{q}}_{2}×\mathbf{s}& -{\mathbf{q}}_{1}×\mathbf{s}& {\mathbf{q}}_{1}×{\mathbf{q}}_{2}\end{array}\right]{\mathbf{l}}_{\infty }\phantom{\rule{0ex}{0ex}}=\lambda {\mathbf{q}}_{3},$
where $\lambda =1/\mathrm{det}\text{\hspace{0.17em}}{\mathrm{\Pi }}_{2}$, and $\mathrm{cof}\left(·\right)$ denotes the cofactor matrix.
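
Equation (67) can be verified with a few lines of code. The sketch assumes ${\mathbf{l}}_{\infty }={\left[0,0,1\right]}^{T}$ for the line at infinity of the $xy$-plane; the vectors `q1`, `q2`, and `s` are hypothetical values.

```python
import numpy as np

# Illustrative orthonormal pair q1, q2 and translation s.
q1 = np.array([1.0, 0.0, 0.0])
q2 = np.array([0.0, 0.8, 0.6])
s = np.array([0.1, -0.2, 3.0])
Pi2 = np.column_stack([q1, q2, s])        # Eq. (54)

l_inf = np.array([0.0, 0.0, 1.0])         # assumed line at infinity
m_inf = np.linalg.inv(Pi2).T @ l_inf      # first equality of Eq. (67)

q3 = np.cross(q1, q2)                     # Eq. (66)
lam = 1.0 / np.linalg.det(Pi2)            # scale factor of Eq. (67)
```

The vector `m_inf` is expected to equal `lam * q3`, so its cross product with `q3` should vanish.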

In the following section, the developed theoretical framework is applied to a real problem.

## Pinhole Camera Model

In practice, the imaging process is performed by a camera lens device as shown in Fig. 7(a). This device produces high quality images because of a complicated system of lenses that minimizes aberration and distortion. However, the imaging process can be modeled using a single thin lens as shown in Fig. 7(b). Moreover, the imaging model can be easily derived using the equivalent pinhole camera as shown in Fig. 7(c).

## Fig. 7

(a) Illustration of a camera lens. (b) The imaging process modeled using a single thin lens. (c) A pinhole camera. The planes $z=-f$ and $z=f$ are the actual and conjugate image planes, respectively.

In the pinhole camera, the origin of a coordinate system is fixed at the pinhole and the $z$-axis is parallel to the optical axis. The plane $z=-f$, where $f$ is the focal length, is the actual image plane. Note that the image is inverted; therefore, the $x$ and $y$ axes are reversed to describe the image as a magnified version of the object. The inversion of the axes is avoided using the conjugate image plane $z=f$ as shown in Fig. 7(c).

## 5.1.

### Centered Pinhole Camera

A typical representation of a pinhole camera is shown in Fig. 8. The coordinate system ${O}_{c}{x}_{c}{y}_{c}{z}_{c}$ is known as the camera reference frame. Let

## Eq. (68)

${\mathbf{p}}_{c}={\left[\begin{array}{ccc}{x}_{c}& {y}_{c}& {z}_{c}\end{array}\right]}^{T}$
be the coordinates of a point in the camera reference frame. The point ${\mathbf{p}}_{c}$ will be imaged in the plane ${z}_{c}=f$ at the point

## Eq. (69)

$\mathbf{\beta }={\left[\begin{array}{cc}{\beta }_{x}& {\beta }_{y}\end{array}\right]}^{T},$
where $\mathbf{\beta }$ will be referred to as the physical image coordinates. The pinhole projection model relates the vectors ${\mathbf{p}}_{c}$ and $\mathbf{\beta }$ by

## Eq. (70)

$\left[\begin{array}{c}{\beta }_{x}\\ {\beta }_{y}\\ f\end{array}\right]=\frac{f}{{z}_{c}}\left[\begin{array}{c}{x}_{c}\\ {y}_{c}\\ {z}_{c}\end{array}\right].$
Using homogeneous coordinates for the image point $\mathbf{\beta }$, Eq. (70) can be rewritten as

## Eq. (71)

$\mathbf{\beta }={\mathcal{H}}_{f}^{-1}\left[{\mathbf{p}}_{c}\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[{\mathrm{\Xi }}_{f}^{-1}{\mathbf{p}}_{c}\right].$
The image formed in the sensor of the camera is sampled as an array of pixels. Then, the physical coordinates $\mathbf{\beta }$ will be transformed to the pixel coordinates

## Eq. (72)

$\mathbf{\mu }={\left[\begin{array}{cc}u& v\end{array}\right]}^{T},$
which depend on the size of the pixel and skew (diagonal distortion) as shown in Fig. 8. The sampling can be described as

## Eq. (73)

$u=\left({\beta }_{x}+{\tau }_{x}\right)/{s}_{x}+\sigma {\beta }_{y},\phantom{\rule{0ex}{0ex}}v=\left({\beta }_{y}+{\tau }_{y}\right)/{s}_{y},$
where ${s}_{x}$ and ${s}_{y}$ (with units of length) are the width and height of the pixel, respectively, $\mathbf{\tau }={\left[{\tau }_{x},{\tau }_{y}\right]}^{T}$ is known as the principal point and represents the point (from the $uv$-reference frame) where the optical axis crosses the image plane, and $\sigma$ is the skew factor ($\sigma =0$ for most camera sensors). Equation (73) can be written as

## Eq. (74)

$\left[\begin{array}{c}u\\ v\\ 1\end{array}\right]=\left[\begin{array}{ccc}1/{s}_{x}& \sigma & {\tau }_{x}/{s}_{x}\\ 0& 1/{s}_{y}& {\tau }_{y}/{s}_{y}\\ 0& 0& 1\end{array}\right]\left[\begin{array}{c}{\beta }_{x}\\ {\beta }_{y}\\ 1\end{array}\right],$
or using a compact notation

## Eq. (75)

$\mathbf{\mu }={\mathcal{H}}^{-1}\left[S\mathcal{H}\left[\mathbf{\beta }\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{P}}_{S}\left[\mathbf{\beta }\right],$
where $S$ is the sampling matrix given as

## Eq. (76)

$S=\left[\begin{array}{ccc}1/{s}_{x}& \sigma & {\tau }_{x}/{s}_{x}\\ 0& 1/{s}_{y}& {\tau }_{y}/{s}_{y}\\ 0& 0& 1\end{array}\right].$
Substituting Eq. (71) into Eq. (75), we obtain the image $\mathbf{\mu }$ (in pixel coordinates) of the point ${\mathbf{p}}_{c}$ as

## Eq. (77)

$\mathbf{\mu }={\mathcal{P}}_{S}\left[{\mathcal{H}}^{-1}\left[{\mathrm{\Xi }}_{f}^{-1}{\mathbf{p}}_{c}\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[S{\mathrm{\Xi }}_{f}^{-1}{\mathbf{p}}_{c}\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[K{\mathbf{p}}_{c}\right],$
where $K=S{\mathrm{\Xi }}_{f}^{-1}$ is known as the matrix of intrinsic camera parameters having the explicit form

## Eq. (78)

$K=\left[\begin{array}{ccc}1/{s}_{x}& \sigma & {\tau }_{x}/f{s}_{x}\\ 0& 1/{s}_{y}& {\tau }_{y}/f{s}_{y}\\ 0& 0& 1/f\end{array}\right].$
Since $\mathrm{det}\text{\hspace{0.17em}}K={\left({s}_{x}{s}_{y}f\right)}^{-1}$, the matrix $K$ is nonsingular for any experimental case.

## Fig. 8

The centered pinhole camera and sampling of the image plane.

Given a point $\mathbf{\mu }$ (in pixel coordinates), the actual coordinates ${\mathcal{H}}_{f}\left[\mathbf{\beta }\right]$ of an image point (physical coordinates on the image plane $z=f$) can be obtained from Eq. (75) as

## Eq. (79)

${\mathcal{H}}_{f}\left[\mathbf{\beta }\right]={\mathcal{H}}_{f}\left[{\mathcal{P}}_{{S}^{-1}}\left[\mathbf{\mu }\right]\right]\phantom{\rule{0ex}{0ex}}={\mathrm{\Xi }}_{f}\mathcal{H}\left[{\mathcal{P}}_{{S}^{-1}}\left[\mathbf{\mu }\right]\right]\phantom{\rule{0ex}{0ex}}={\mathrm{\Xi }}_{f}{S}^{-1}\mathcal{H}\left[\mathbf{\mu }\right]/\mathcal{S}\left[{S}^{-1}\mathcal{H}\left[\mathbf{\mu }\right]\right]\phantom{\rule{0ex}{0ex}}={K}^{-1}\mathcal{H}\left[\mathbf{\mu }\right],$
where the equality $\mathcal{S}\left[{S}^{-1}\mathcal{H}\left[\mathbf{\mu }\right]\right]=1$ was used.
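
The centered pinhole model of Eqs. (70) to (79) can be sketched as follows. The sketch assumes ${\mathrm{\Xi }}_{f}^{-1}=\mathrm{diag}\left(1,1,1/f\right)$, which is the form implied by Eqs. (76) and (78); all camera values are hypothetical and do not correspond to the camera used in this paper.

```python
import numpy as np

def H(x):
    return np.append(np.asarray(x, dtype=float), 1.0)

def H_inv(v):
    v = np.asarray(v, dtype=float)
    return v[:-1] / v[-1]

# Hypothetical camera values (lengths in meters).
f = 4.0e-3                         # focal length
sx, sy = 2.0e-6, 2.0e-6            # pixel width and height
tau = np.array([1.0e-3, 0.8e-3])   # principal point
sigma = 0.0                        # skew factor

S = np.array([[1.0 / sx, sigma, tau[0] / sx],
              [0.0, 1.0 / sy, tau[1] / sy],
              [0.0, 0.0, 1.0]])            # Eq. (76)
Xi_f_inv = np.diag([1.0, 1.0, 1.0 / f])    # assumed form of Xi_f^{-1}
K = S @ Xi_f_inv                           # Eq. (78)

p_c = np.array([0.05, -0.02, 1.5])         # point in the camera frame
beta = f * p_c[:2] / p_c[2]                # Eq. (70)
mu_sampled = np.array([(beta[0] + tau[0]) / sx + sigma * beta[1],
                       (beta[1] + tau[1]) / sy])   # Eq. (73)
mu = H_inv(K @ p_c)                        # Eq. (77)
Hf_beta = np.linalg.inv(K) @ H(mu)         # Eq. (79)
```

Here `mu` should agree with `mu_sampled`, and `Hf_beta` should recover ${\left[{\beta }_{x},{\beta }_{y},f\right]}^{T}$.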

## 5.2.

### Noncentered Pinhole Camera

Let us consider that the pinhole camera is at an arbitrary position and orientation with respect to a world coordinate system $Oxyz$ as shown in Fig. 9. The position and orientation of the camera are defined by the vector $\mathbf{t}$ and the rotation matrix $R$, respectively. Let

## Eq. (80)

$\mathbf{p}={\left[\begin{array}{ccc}x& y& z\end{array}\right]}^{T}$
be a point in the world coordinate system. Then, the point $\mathbf{p}$ is seen from the camera reference frame as

## Eq. (81)

${\mathbf{p}}_{c}={R}^{T}\left(\mathbf{p}-\mathbf{t}\right)=L\mathcal{H}\left[\mathbf{p}\right],$
where $L$ is known as the matrix of extrinsic camera parameters having the explicit form

## Eq. (82)

$L=\left[\begin{array}{cc}{R}^{T}& -{R}^{T}\mathbf{t}\end{array}\right].$
By substituting Eq. (81) into Eq. (77), the complete imaging process by a noncentered pinhole camera is given as

## Eq. (83)

$\mathbf{\mu }={\mathcal{H}}^{-1}\left[KL\mathcal{H}\left[\mathbf{p}\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[C\mathcal{H}\left[\mathbf{p}\right]\right],$
where $C=KL$ is the matrix of the camera.
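
A minimal sketch of Eqs. (81) to (83) is given below. The intrinsic matrix `K` is written up to scale in pixel units, and the pose `R`, `t` and the point `p` are hypothetical values.

```python
import numpy as np

def H(x):
    return np.append(np.asarray(x, dtype=float), 1.0)

def H_inv(v):
    v = np.asarray(v, dtype=float)
    return v[:-1] / v[-1]

# Hypothetical intrinsic matrix (defined up to scale).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical camera pose: rotation of 20 degrees about the y-axis.
b = np.deg2rad(20.0)
R = np.array([[np.cos(b), 0.0, np.sin(b)],
              [0.0, 1.0, 0.0],
              [-np.sin(b), 0.0, np.cos(b)]])
t = np.array([0.2, -0.1, -1.0])

L = np.hstack([R.T, (-R.T @ t).reshape(3, 1)])   # Eq. (82), size 3x4
C = K @ L                                        # camera matrix

p = np.array([0.3, 0.4, 2.0])      # point in world coordinates
p_c = R.T @ (p - t)                # Eq. (81)
mu = H_inv(C @ H(p))               # Eq. (83)
```

The product `L @ H(p)` should coincide with `p_c`, and `mu` with the centered projection of `p_c`.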

## 5.3.

### Homography Matrix

In general terms, Eq. (83) describes a transformation of points $\mathbf{p}$ of the 3-D space to points of the 2-D one. A very useful transformation is obtained when $\mathbf{p}$ represents points of a plane in the 3-D space. In this case, Eq. (83) is reduced to a transformation from the 2-D space to itself.

Consider that $\mathbf{p}$ represents the points of a plane in the 3-D space; mathematically, see Eq. (53)

## Eq. (84)

$\mathbf{p}=\mathrm{\Pi }\mathcal{H}\left[\mathbf{\rho }\right],$
where $\mathbf{\rho }={\left[{\rho }_{x},{\rho }_{y}\right]}^{T}$ parameterizes the plane, $\mathrm{\Pi }=\left[{\mathbf{q}}_{1},{\mathbf{q}}_{2},\mathbf{s}\right]$ is the matrix of the plane, ${\mathbf{q}}_{1}$ and ${\mathbf{q}}_{2}$ are columns of the rotation matrix $Q=\left[{\mathbf{q}}_{1},{\mathbf{q}}_{2},{\mathbf{q}}_{3}\right]$, and $\mathbf{s}$ is a translation vector. Next, the points $\mathbf{p}$ are transformed to $\mathbf{\mu }$ by Eq. (83) as

## Eq. (85)

$\mathbf{\mu }={\mathcal{H}}^{-1}\left[C\mathcal{H}\left[\mathrm{\Pi }\mathcal{H}\left[\mathbf{\rho }\right]\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[C\left[\begin{array}{c}\mathrm{\Pi }\\ \mathcal{H}{\left[{\mathbf{0}}_{2}\right]}^{T}\end{array}\right]\mathcal{H}\left[\mathbf{\rho }\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{H}}^{-1}\left[G\mathcal{H}\left[\mathbf{\rho }\right]\right]\phantom{\rule{0ex}{0ex}}={\mathcal{P}}_{G}\left[\mathbf{\rho }\right],$
where $G$ is known as the homography matrix and has the explicit form

## Eq. (86)

$G=KL\left[\begin{array}{c}\mathrm{\Pi }\\ \mathcal{H}{\left[{\mathbf{0}}_{2}\right]}^{T}\end{array}\right]\phantom{\rule{0ex}{0ex}}=K{R}^{T}\left(\mathrm{\Pi }-\mathbf{t}\mathcal{H}{\left[{\mathbf{0}}_{2}\right]}^{T}\right)\phantom{\rule{0ex}{0ex}}=K{R}^{T}\overline{\mathrm{\Pi }},$
where

## Eq. (87)

$\overline{\mathrm{\Pi }}=\mathrm{\Pi }-\mathbf{t}\mathcal{H}{\left[{\mathbf{0}}_{2}\right]}^{T}\phantom{\rule{0ex}{0ex}}=\left[\begin{array}{ccc}{\mathbf{q}}_{1}& {\mathbf{q}}_{2}& \mathbf{s}-\mathbf{t}\end{array}\right].$
From Eqs. (79) and (85), a point $\mathbf{\rho }$ is imaged at the point with actual image coordinates

## Eq. (88)

${\mathcal{H}}_{f}\left[\mathbf{\beta }\right]={K}^{-1}\mathcal{H}\left[\mathbf{\mu }\right]\phantom{\rule{0ex}{0ex}}={K}^{-1}\mathcal{H}\left[{\mathcal{P}}_{G}\left[\mathbf{\rho }\right]\right]\phantom{\rule{0ex}{0ex}}={K}^{-1}G\mathcal{H}\left[\mathbf{\rho }\right]/\lambda \phantom{\rule{0ex}{0ex}}={R}^{T}\overline{\mathrm{\Pi }}\mathcal{H}\left[\mathbf{\rho }\right]/\lambda ,$
where $\lambda =\mathcal{S}\left[G\mathcal{H}\left[\mathbf{\rho }\right]\right]$.

The homography matrix is singular when the pinhole is at a point of the reference plane. For any other case, $\mathrm{det}\text{\hspace{0.17em}}G={\mathbf{q}}_{3}^{T}\left(\mathbf{s}-\mathbf{t}\right)/\left({s}_{x}{s}_{y}f\right)$ and Eq. (85) can be inverted as

## Eq. (89)

$\mathbf{\rho }={\mathcal{P}}_{{G}^{-1}}\left[\mathbf{\mu }\right].$
The homography matrix is very useful for many computer vision tasks. In Appendix A, the direct linear transformation method for homography estimation is described.
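
The construction of the homography can be sketched numerically as follows. The code builds $G=K{R}^{T}\left[{\mathbf{q}}_{1},{\mathbf{q}}_{2},\mathbf{s}-\mathbf{t}\right]$, compares it with the full camera model applied to a point of the plane, and inverts the mapping as in Eq. (89). All numerical values are hypothetical, and `K` is written up to scale.

```python
import numpy as np

def H(x):
    return np.append(np.asarray(x, dtype=float), 1.0)

def H_inv(v):
    v = np.asarray(v, dtype=float)
    return v[:-1] / v[-1]

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])                  # hypothetical intrinsics
b = np.deg2rad(10.0)
R = np.array([[np.cos(b), 0.0, np.sin(b)],
              [0.0, 1.0, 0.0],
              [-np.sin(b), 0.0, np.cos(b)]])     # hypothetical orientation
t = np.array([0.2, -0.1, -1.0])                  # hypothetical position

a = np.deg2rad(30.0)
q1 = np.array([1.0, 0.0, 0.0])
q2 = np.array([0.0, np.cos(a), np.sin(a)])
q3 = np.cross(q1, q2)
s = np.array([0.0, 0.0, 3.0])

Pi = np.column_stack([q1, q2, s])                # plane matrix of Eq. (84)
Pi_bar = np.column_stack([q1, q2, s - t])        # Eq. (87)
G = K @ R.T @ Pi_bar                             # Eq. (86)

rho = np.array([0.4, -0.3])
L = np.hstack([R.T, (-R.T @ t).reshape(3, 1)])
mu_camera = H_inv(K @ L @ H(Pi @ H(rho)))        # Eq. (83) with Eq. (84)
mu_homography = H_inv(G @ H(rho))                # Eq. (85)
rho_back = H_inv(np.linalg.inv(G) @ H(mu_homography))   # Eq. (89)
det_expected = np.linalg.det(K) * (q3 @ (s - t))
```

Since $\mathrm{det}\text{\hspace{0.17em}}G=\mathrm{det}\text{\hspace{0.17em}}K\text{\hspace{0.17em}}{\mathbf{q}}_{3}^{T}\left(\mathbf{s}-\mathbf{t}\right)$, the value `det_expected` should match the determinant of `G`.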

## Perspective Correction for Document Scanning

A camera document scanning application performs several image processing tasks, such as quadrilateral detection, perspective correction, resampling, and image enhancement. In this section, the perspective correction task is addressed to illustrate the application of the proposed approach.

## 6.1.

### Assumptions

In Appendix A, we show that the perspective of a flat object can be easily corrected using the associated homography. For this, at least four correspondences $\left({\mathbf{\mu }}_{k},{\mathbf{\rho }}_{k}\right)$ must be provided. However, for practical document scanning, the coordinates ${\mathbf{\rho }}_{k}$ are unknown. Instead, it is assumed that the document to be digitized is rectangular and the orthogonality and parallelism properties of its edges are exploited.

The estimation of the homography is greatly simplified by assuming a centered pinhole camera with known intrinsic parameters; e.g., by a previous camera calibration, see Appendix B. Thus, we only need to estimate the reference plane parameters $\mathrm{\Pi }$, i.e., the rotation matrix $Q$ and the translation vector $\mathbf{s}$, see Eq. (84).

## 6.2.

### Estimation of the Reference Plane Parameters

Consider a coordinate system in the reference plane with origin at the center of the document to be scanned as shown in Fig. 10(a). The $x$- and $y$-axes of this coordinate system are parallel with the upper/lower and left/right sides of the paper, respectively. The corners of the document to be digitized have coordinates given by the vectors

## Eq. (90)

${\mathbf{\rho }}_{k},\phantom{\rule[-0.0ex]{2em}{0.0ex}}k=1,\cdots ,4.$
In this configuration, the vectors ${\mathbf{\rho }}_{k}$ are symmetric about the $y$-axis; that is

## Eq. (91)

${\mathbf{\rho }}_{2}=T{\mathbf{\rho }}_{1},\phantom{\rule{0ex}{0ex}}{\mathbf{\rho }}_{4}=T{\mathbf{\rho }}_{3},$
where

## Eq. (92)

$T=\left[\begin{array}{cc}-1& 0\\ 0& 1\end{array}\right].$
When the document is imaged by the camera, the original rectangle is transformed to a quadrilateral because of the perspective distortion. The corners of the imaged document have coordinates given by the vectors

## Eq. (93)

${\mathbf{\mu }}_{k},\phantom{\rule[-0.0ex]{2em}{0.0ex}}k=1,\cdots ,4,$
as shown in Fig. 10(b). The vectors ${\mathbf{\mu }}_{k}$ and ${\mathbf{\rho }}_{k}$ are related by Eqs. (85) and (89); however, the vectors ${\mathbf{\rho }}_{k}$ and the homography $G$ are unavailable. Only the vectors ${\mathbf{\mu }}_{k}$ are available, which are easily obtained from the image by marking the vertices of the imaged document.

## Fig. 10

(a) Depiction of the rectangular paper to be digitized. (b) The quadrilateral obtained by pinhole imaging of a rectangular paper. (c) The remaining rotation after perspective correction of the quadrilateral shown in (b).

The points ${\mathbf{\mu }}_{k}$ are used to compute the following lines, see Fig. 10(b),

## Eq. (94)

${\mathbf{m}}_{1}=\mathcal{H}\left[{\mathbf{\mu }}_{3}\right]×\mathcal{H}\left[{\mathbf{\mu }}_{1}\right],\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{m}}_{4}=\mathcal{H}\left[{\mathbf{\mu }}_{1}\right]×\mathcal{H}\left[{\mathbf{\mu }}_{2}\right],\phantom{\rule{0ex}{0ex}}{\mathbf{m}}_{2}=\mathcal{H}\left[{\mathbf{\mu }}_{2}\right]×\mathcal{H}\left[{\mathbf{\mu }}_{4}\right],\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{m}}_{5}=\mathcal{H}\left[{\mathbf{\mu }}_{4}\right]×\mathcal{H}\left[{\mathbf{\mu }}_{1}\right],\phantom{\rule{0ex}{0ex}}{\mathbf{m}}_{3}=\mathcal{H}\left[{\mathbf{\mu }}_{3}\right]×\mathcal{H}\left[{\mathbf{\mu }}_{4}\right],\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{m}}_{6}=\mathcal{H}\left[{\mathbf{\mu }}_{2}\right]×\mathcal{H}\left[{\mathbf{\mu }}_{3}\right].$
Next, with the lines ${\mathbf{m}}_{k}$, the following three intersection points are computed

## Eq. (95)

${\mathbf{\mu }}_{0}={\mathcal{H}}^{-1}\left[{\mathbf{m}}_{1}×{\mathbf{m}}_{2}\right],\phantom{\rule{0ex}{0ex}}{\mathbf{\mu }}_{a}={\mathcal{H}}^{-1}\left[{\mathbf{m}}_{3}×{\mathbf{m}}_{4}\right],\phantom{\rule{0ex}{0ex}}{\mathbf{\mu }}_{b}={\mathcal{H}}^{-1}\left[{\mathbf{m}}_{5}×{\mathbf{m}}_{6}\right].$
Since the intrinsic camera parameters are assumed to be known, the actual image coordinates of the points ${\mathbf{\mu }}_{i}$ can be obtained by Eq. (79) as

## Eq. (96)

${\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{i}\right]={K}^{-1}\mathcal{H}\left[{\mathbf{\mu }}_{i}\right],$
with $i=a,b,0,1,2,3,4$.

## 6.2.1.

#### Normal vector

Note that the points ${\mathbf{\mu }}_{a}$ and ${\mathbf{\mu }}_{b}$ are the projections of the ideal points

## Eq. (97)

$\mathcal{H}\left[{\mathbf{\rho }}_{a}\right]={\left[\begin{array}{ccc}1& 0& 0\end{array}\right]}^{T},\phantom{\rule{0ex}{0ex}}\mathcal{H}\left[{\mathbf{\rho }}_{b}\right]={\left[\begin{array}{ccc}0& 1& 0\end{array}\right]}^{T},$
respectively. Thus, the line ${\mathbf{m}}_{\infty }$ (in physical coordinates on the image plane $z=f$) is parallel to the normal ${\mathbf{q}}_{3}$, see Eq. (67). That is,

## Eq. (98)

$\mathbf{n}={\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{a}\right]×{\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{b}\right]\phantom{\rule{0ex}{0ex}}={\mathbf{q}}_{3}/\lambda ,$
for some $\lambda$. Thus, the normal of the reference plane is obtained as the normalization of the vector $\mathbf{n}$, namely

## Eq. (99)

${\mathbf{q}}_{3}=\mathbf{n}/‖\mathbf{n}‖\phantom{\rule{0ex}{0ex}}=\frac{{K}^{T}\left[\left({\mathbf{m}}_{3}×{\mathbf{m}}_{4}\right)×\left({\mathbf{m}}_{5}×{\mathbf{m}}_{6}\right)\right]}{‖{K}^{T}\left[\left({\mathbf{m}}_{3}×{\mathbf{m}}_{4}\right)×\left({\mathbf{m}}_{5}×{\mathbf{m}}_{6}\right)\right]‖}.$
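
The normal estimation of Eq. (99) can be sketched as follows. Synthetic corners are generated from a known plane so that the estimate can be compared with the true normal. The function name `plane_normal` and all numerical values are hypothetical; the intersection points are kept in homogeneous form so that points at infinity cause no division by zero. The comparison is made up to sign, since the sign of the cross products depends on the ordering of the corners.

```python
import numpy as np

def H(x):
    return np.append(np.asarray(x, dtype=float), 1.0)

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def plane_normal(K, mu):
    """Unit normal of Eq. (99) from the four imaged corners mu (4x2)."""
    h1, h2, h3, h4 = (H(m) for m in mu)
    m3, m4 = np.cross(h3, h4), np.cross(h1, h2)   # lower and upper sides
    m5, m6 = np.cross(h4, h1), np.cross(h2, h3)   # right and left sides
    n = K.T @ np.cross(np.cross(m3, m4), np.cross(m5, m6))
    return n / np.linalg.norm(n)

# Synthetic data (hypothetical values), centered camera.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
Q = rot_z(0.3) @ rot_y(0.5) @ rot_z(0.2)
s = np.array([0.1, -0.2, 4.0])
Pi = np.column_stack([Q[:, 0], Q[:, 1], s])
rho = np.array([[0.6, 0.8], [-0.6, 0.8], [-0.6, -0.8], [0.6, -0.8]])
mu = np.array([v[:2] / v[2] for v in (K @ Pi @ H(r) for r in rho)])

q3_est = plane_normal(K, mu)
```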

## 6.2.2.

#### Translation vector

The translation vector $\mathbf{s}$ is obtained by taking into account that ${\mathcal{P}}_{G}$ preserves the line–line intersection. Thus, from Eq. (65) we have ${\mathbf{\mu }}_{0}={\mathcal{P}}_{G}\left[{\mathbf{\rho }}_{0}\right]$ with ${\mathbf{\rho }}_{0}={\mathbf{0}}_{2}$. Therefore, from Eq. (88) we have

## Eq. (100)

${\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{0}\right]=\overline{\mathrm{\Pi }}\mathcal{H}\left[{\mathbf{0}}_{2}\right]/\xi \phantom{\rule{0ex}{0ex}}=\mathbf{s}/\xi ,$
where $\xi$ is a scalar to be determined. For this, note that the vectors

## Eq. (101)

${\mathbf{p}}_{k}={\zeta }_{k}{\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{k}\right],\phantom{\rule[-0.0ex]{2em}{0.0ex}}k=1,\cdots ,4,$
are points of the reference plane for values ${\zeta }_{k}$ and $\xi$ such that the equation of the plane ${\mathbf{q}}_{3}^{T}\left({\mathbf{p}}_{k}-\mathbf{s}\right)=0$ is satisfied. This leads to

## Eq. (102)

${\zeta }_{k}=\xi \frac{{\mathbf{q}}_{3}^{T}{\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{0}\right]}{{\mathbf{q}}_{3}^{T}{\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{k}\right]}.$
Since the points ${\mathbf{\rho }}_{k}$ are on the unit circle, see Fig. 10(a), we have $‖{\mathbf{\rho }}_{k}‖=‖{\mathbf{p}}_{k}-\mathbf{s}‖=1$, which leads to

## Eq. (103)

${\xi }_{k}={‖\frac{{\mathbf{q}}_{3}^{T}{\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{0}\right]}{{\mathbf{q}}_{3}^{T}{\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{k}\right]}{\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{k}\right]-{\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{0}\right]‖}^{-1},$
where the subindex $k$ in $\xi$ emphasizes the fact that a different value could be obtained for each ${\mathcal{H}}_{f}\left[{\mathbf{\beta }}_{k}\right]$ due to inaccuracies of ${\mathbf{\mu }}_{k}$. Therefore, the value $\xi$ is computed as

## Eq. (104)

$\xi =\text{mean}\left\{{\xi }_{1},{\xi }_{2},{\xi }_{3},{\xi }_{4}\right\}.$
The result is used in Eq. (100), and the translation of the reference plane is now available.
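
The steps of Eqs. (100) to (104) can be sketched as follows, assuming that the normal ${\mathbf{q}}_{3}$ is already available. Synthetic corners on the unit circle are generated from a known plane so that the recovered translation can be compared with the true one. The function name `translation` and all numerical values are hypothetical.

```python
import numpy as np

def H(x):
    return np.append(np.asarray(x, dtype=float), 1.0)

def translation(K, q3, mu):
    """Translation s from Eqs. (100)-(104); mu is a 4x2 array of corners."""
    h = [H(m) for m in mu]
    m1 = np.cross(h[2], h[0])            # diagonal through mu_3 and mu_1
    m2 = np.cross(h[1], h[3])            # diagonal through mu_2 and mu_4
    h0 = np.cross(m1, m2)
    h0 = h0 / h0[2]                      # H[mu_0], Eq. (95)
    K_inv = np.linalg.inv(K)
    b0 = K_inv @ h0                      # H_f[beta_0], Eq. (96)
    xi_k = []
    for hk in h:
        bk = K_inv @ hk                  # H_f[beta_k]
        ratio = (q3 @ b0) / (q3 @ bk)    # zeta_k / xi, Eq. (102)
        xi_k.append(1.0 / np.linalg.norm(ratio * bk - b0))   # Eq. (103)
    xi = np.mean(xi_k)                   # Eq. (104)
    return xi * b0                       # Eq. (100) solved for s

# Synthetic data (hypothetical values), K in the form of Eq. (78).
f, sx, sy = 4.0e-3, 5.0e-6, 5.0e-6
tau = (1.6e-3, 1.2e-3)
K = np.array([[1.0 / sx, 0.0, tau[0] / (f * sx)],
              [0.0, 1.0 / sy, tau[1] / (f * sy)],
              [0.0, 0.0, 1.0 / f]])
q1 = np.array([1.0, 0.0, 0.0])
q2 = np.array([0.0, 0.8, 0.6])
q3 = np.cross(q1, q2)
s_true = np.array([0.2, -0.1, 5.0])
Pi = np.column_stack([q1, q2, s_true])
rho = np.array([[0.6, 0.8], [-0.6, 0.8], [-0.6, -0.8], [0.6, -0.8]])
mu = np.array([v[:2] / v[2] for v in (K @ Pi @ H(r) for r in rho)])

s_est = translation(K, q3, mu)
```

With noise-free corners, the four values ${\xi }_{k}$ coincide and `s_est` should reproduce `s_true`; with real data the mean in Eq. (104) averages the discrepancies.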

## 6.2.3.

#### Euler angles

The reference plane is fully characterized by six degrees of freedom (DOF), namely position (three coordinates) and orientation (three angles). The vectors ${\mathbf{q}}_{3}$ and $\mathbf{s}$ provide five DOFs. Specifically, the vector $\mathbf{s}$ provides three DOFs that fix the position while ${\mathbf{q}}_{3}$ provides two DOFs defining the orientation by the azimuth and polar angles given, respectively, by

## Eq. (105)

$\mathrm{tan}\text{\hspace{0.17em}}\varphi ={q}_{23}/{q}_{13},\phantom{\rule{0ex}{0ex}}\mathrm{cos}\text{\hspace{0.17em}}\theta ={q}_{33},$
where ${\left[{q}_{13},{q}_{23},{q}_{33}\right]}^{T}={\mathbf{q}}_{3}$ is the third column of the rotation matrix $Q$. The remaining angle $\gamma$ (the angle around the normal ${\mathbf{q}}_{3}$) can be obtained as follows.

From Eqs. (84) and (53), we have

## Eq. (106)

${\mathbf{p}}_{k}=Q{\mathcal{H}}_{0}\left[{\mathbf{\rho }}_{k}\right]+\mathbf{s},$
where the matrix $Q$ is defined as the Euler sequence

## Eq. (107)

$Q={Q}_{z}\left(\varphi \right){Q}_{y}\left(\theta \right){Q}_{z}\left(\gamma \right),$
with ${Q}_{z}$ and ${Q}_{y}$ being the rotation matrices around the $z$- and $y$-axes, respectively. Thus, using Eq. (101), the estimated vector $\mathbf{s}$, and the angles $\theta$ and $\varphi$, we compute the (perspective corrected) points

## Eq. (108)

${\mathbf{\delta }}_{k}={\mathcal{H}}_{0}^{-1}\left[{Q}_{y}^{T}\left(\theta \right){Q}_{z}{\left(\varphi \right)}^{T}\left({\mathbf{p}}_{k}-\mathbf{s}\right)\right]=\left[\begin{array}{c}{\delta }_{xk}\\ {\delta }_{yk}\end{array}\right],$
with $k=1,\cdots ,4$, see Fig. 10(c). The vectors ${\mathbf{\delta }}_{k}$ and ${\mathbf{\rho }}_{k}$ are related by

## Eq. (109)

${\mathbf{\rho }}_{k}={\overline{Q}}_{z}^{T}\left(\gamma \right){\mathbf{\delta }}_{k},$
where

## Eq. (110)

${\overline{Q}}_{z}^{T}\left(\gamma \right)=\left[\begin{array}{cc}\mathrm{cos}\text{\hspace{0.17em}}\gamma & \mathrm{sin}\text{\hspace{0.17em}}\gamma \\ -\mathrm{sin}\text{\hspace{0.17em}}\gamma & \mathrm{cos}\text{\hspace{0.17em}}\gamma \end{array}\right].$
The vectors ${\mathbf{\rho }}_{k}$ are unavailable, but we use their symmetry properties given in Eq. (91) to obtain

## Eq. (111)

${\overline{Q}}_{z}^{T}\left(\gamma \right){\mathbf{\delta }}_{2}=T{\overline{Q}}_{z}^{T}\left(\gamma \right){\mathbf{\delta }}_{1},\phantom{\rule{0ex}{0ex}}{\overline{Q}}_{z}^{T}\left(\gamma \right){\mathbf{\delta }}_{4}=T{\overline{Q}}_{z}^{T}\left(\gamma \right){\mathbf{\delta }}_{3}.$
The product ${\overline{Q}}_{z}^{T}\left(\gamma \right){\mathbf{\delta }}_{k}$ can be written as

## Eq. (112)

${\overline{Q}}_{z}^{T}\left(\gamma \right){\mathbf{\delta }}_{k}=\mathcal{B}\left[{\mathbf{\delta }}_{k}\right]\mathrm{\Gamma },$
where $\mathrm{\Gamma }={\left[\mathrm{sin}\text{\hspace{0.17em}}\gamma ,\mathrm{cos}\text{\hspace{0.17em}}\gamma \right]}^{T}$ and

## Eq. (113)

$\mathcal{B}\left[{\mathbf{\delta }}_{k}\right]=\left[\begin{array}{cc}{\delta }_{yk}& {\delta }_{xk}\\ -{\delta }_{xk}& {\delta }_{yk}\end{array}\right].$
Thus, Eq. (111) can be rewritten as

## Eq. (114)

$\mathbb{B}\mathrm{\Gamma }={\mathbf{0}}_{4},$
where

## Eq. (115)

$\mathbb{B}=\left[\begin{array}{c}\mathcal{B}\left[{\mathbf{\delta }}_{2}\right]-T\mathcal{B}\left[{\mathbf{\delta }}_{1}\right]\\ \mathcal{B}\left[{\mathbf{\delta }}_{4}\right]-T\mathcal{B}\left[{\mathbf{\delta }}_{3}\right]\end{array}\right].$
The nontrivial solution of Eq. (114) for $\mathrm{\Gamma }$ is obtained as the right-singular vector corresponding to the smallest singular value of $\mathbb{B}$. Finally, the angle $\gamma$ is obtained from $\mathrm{\Gamma }$ by

## Eq. (116)

$\mathrm{tan}\text{\hspace{0.17em}}\gamma ={\mathcal{H}}^{-1}\left[\mathrm{\Gamma }\right].$
The estimated parameters are used to create the matrix $\mathrm{\Pi }$. Then, the required homography $G$ is obtained by Eq. (86) (with $\mathbf{t}={\mathbf{0}}_{3}$ and $R=\mathbb{I}$ because of the centered pinhole camera configuration). Finally, the perspective distortion of the image is corrected by displaying the intensity of each pixel of the image at the point $\mathbf{\rho }$ computed by Eq. (89).
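
The estimation of the angle $\gamma$ by Eqs. (111) to (116) can be sketched as follows. The perspective-corrected points ${\mathbf{\delta }}_{k}$ are simulated by rotating a hypothetical rectangle by a known angle; the name `B_op` and all values are assumptions made for this example. Because $\mathrm{\Gamma }$ is determined up to sign, only $\mathrm{tan}\text{\hspace{0.17em}}\gamma$ is compared.

```python
import numpy as np

def B_op(d):
    """Matrix B[delta] of Eq. (113)."""
    return np.array([[d[1], d[0]],
                     [-d[0], d[1]]])

T = np.diag([-1.0, 1.0])                     # Eq. (92)

# Simulated data: rectangle corners rho_k rotated by a known angle.
gamma_true = 0.7
c, s = np.cos(gamma_true), np.sin(gamma_true)
Qz2 = np.array([[c, -s], [s, c]])            # 2-D rotation by gamma
rho = np.array([[0.6, 0.8], [-0.6, 0.8], [-0.6, -0.8], [0.6, -0.8]])
delta = np.array([Qz2 @ r for r in rho])     # inverse of Eq. (109)

BB = np.vstack([B_op(delta[1]) - T @ B_op(delta[0]),
                B_op(delta[3]) - T @ B_op(delta[2])])   # Eq. (115)

# Right-singular vector of the smallest singular value, Eq. (114).
_, sing_vals, Vt = np.linalg.svd(BB)
Gamma = Vt[-1]
tan_gamma = Gamma[0] / Gamma[1]              # Eq. (116)
```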

## 6.3.

### Illustrative Example

The functionality of the presented algorithm is illustrated by the following example. The camera described in Appendix B and the estimated intrinsic parameters $K$ given in Eq. (156) are used here.

Figure 11(a) shows the image of a rectangular object acquired by the camera. Then, the four corners of the quadrilateral are marked on the image as shown by the yellow circles in Fig. 11(b). The points ${\mathbf{\mu }}_{0}$, ${\mathbf{\mu }}_{a}$, and ${\mathbf{\mu }}_{b}$ are indicated by the red circles in Fig. 11(b). It is worth mentioning that ${\mathbf{\mu }}_{a}$, ${\mathbf{\mu }}_{b}$, or both could be points at infinity. Even in these cases, the presented methodology is valid.

## Fig. 11

(a) An input image with a rectangular object in scene. (b) The corners of the rectangle are marked by yellow circles. (c) Corrected image. (d) Zoom of (c) highlighting the region of interest.

The parameters estimated from the four corners are

## Eq. (117)

$\mathbf{s}={\left[\begin{array}{ccc}-0.2289& 0.0561& 2.9236\end{array}\right]}^{T},\phantom{\rule{0ex}{0ex}}\varphi =0.6776,\phantom{\rule{0ex}{0ex}}\theta =0.9879,\phantom{\rule{0ex}{0ex}}\gamma =2.3041.$
With these parameters, the matrix of the reference plane is

## Eq. (118)

$\mathrm{\Pi }=\left[\begin{array}{ccc}-0.7528& 0.1010& -0.2289\\ 0.3478& -0.7779& 0.0561\\ 0.5588& 0.6202& 2.9236\end{array}\right].$
Thus, the resulting homography is

## Eq. (119)

$G=\left[\begin{array}{ccc}-2.0219& 0.2429& -0.7272\\ 0.9237& -2.0786& 0.1298\\ 0.5588& 0.6202& 2.9236\end{array}\right].$
All points $\mathbf{\mu }$ of the image are transformed to points $\mathbf{\rho }$ of the reference plane by Eq. (89). Next, the pixels of the image are displayed at the points $\mathbf{\rho }$ as shown in Fig. 11(c).
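
As a consistency check, the matrix of Eq. (118) can be rebuilt from the parameters of Eq. (117) using the Euler sequence of Eq. (107). The sketch assumes the standard right-handed rotation matrices about the $z$- and $y$-axes; the helper names `rot_z` and `rot_y` are hypothetical. Because the printed values are rounded to four decimals, the comparison uses a loose tolerance.

```python
import numpy as np

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Parameters reported in Eq. (117).
s = np.array([-0.2289, 0.0561, 2.9236])
phi, theta, gamma = 0.6776, 0.9879, 2.3041

Q = rot_z(phi) @ rot_y(theta) @ rot_z(gamma)     # Eq. (107)
Pi = np.column_stack([Q[:, 0], Q[:, 1], s])      # Eq. (84)

# Matrix reported in Eq. (118).
Pi_paper = np.array([[-0.7528, 0.1010, -0.2289],
                     [0.3478, -0.7779, 0.0561],
                     [0.5588, 0.6202, 2.9236]])
max_diff = np.max(np.abs(Pi - Pi_paper))

# Azimuth and polar angles recovered from q3, Eq. (105).
phi_check = np.arctan2(Q[1, 2], Q[0, 2])
theta_check = np.arccos(Q[2, 2])
```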

With the correction of perspective, the yellow circles in Fig. 11(b) become the green ones in Fig. 11(c). The region of interest is the rectangle with corners marked by green circles in Fig. 11(c). Finally, a zoom of the region of interest is shown in Fig. 11(d).

## Conclusions

An operator-based approach for homogeneous coordinates was proposed. Several basic geometrical concepts and properties of the operators were investigated. With the proposed approach, the pinhole camera model and a simple camera calibration method were described. This work was motivated by the development of a perspective correction method useful for a camera document scanning application. Several experimental results illustrate the analyzed theoretical aspects. The proposed approach could be a good starting point to introduce inexperienced students to the scientific discipline of computer vision.

## Appendix A:

### Estimation of the Homography Matrix

In this appendix, we illustrate the method known as direct linear transformation for homography matrix estimation. This method is very useful for illustration purposes because of its simplicity. However, higher accuracy and robustness are achieved with more advanced methods available in the literature.9,13

Let $G$ be the homography matrix defined in Eq. (86). Consider that the matrix $G$ is row partitioned as follows:

## Eq. (120)

$G=\left[\begin{array}{ccc}{g}_{11}& {g}_{12}& {g}_{13}\\ {g}_{21}& {g}_{22}& {g}_{23}\\ {g}_{31}& {g}_{32}& {g}_{33}\end{array}\right]=\left[\begin{array}{c}{\overline{\mathbf{g}}}_{1}^{T}\\ {\overline{\mathbf{g}}}_{2}^{T}\\ {\overline{\mathbf{g}}}_{3}^{T}\end{array}\right],$
where

## Eq. (121)

${\overline{\mathbf{g}}}_{1}^{T}=\left[\begin{array}{ccc}{g}_{11}& {g}_{12}& {g}_{13}\end{array}\right],\phantom{\rule{0ex}{0ex}}{\overline{\mathbf{g}}}_{2}^{T}=\left[\begin{array}{ccc}{g}_{21}& {g}_{22}& {g}_{23}\end{array}\right],\phantom{\rule{0ex}{0ex}}{\overline{\mathbf{g}}}_{3}^{T}=\left[\begin{array}{ccc}{g}_{31}& {g}_{32}& {g}_{33}\end{array}\right].$

Equation (85), which relates points of the reference and image planes, can be rewritten as

## Eq. (122)

$\mathbf{\mu }=\left[\begin{array}{c}u\\ v\end{array}\right]=\frac{1}{{\overline{\mathbf{g}}}_{3}^{T}\mathcal{H}\left[\mathbf{\rho }\right]}\left[\begin{array}{c}{\overline{\mathbf{g}}}_{1}^{T}\mathcal{H}\left[\mathbf{\rho }\right]\\ {\overline{\mathbf{g}}}_{2}^{T}\mathcal{H}\left[\mathbf{\rho }\right]\end{array}\right],$
or

## Eq. (123)

$\left[\begin{array}{c}u{\overline{\mathbf{g}}}_{3}^{T}\mathcal{H}\left[\mathbf{\rho }\right]\\ v{\overline{\mathbf{g}}}_{3}^{T}\mathcal{H}\left[\mathbf{\rho }\right]\end{array}\right]=\left[\begin{array}{c}{\overline{\mathbf{g}}}_{1}^{T}\mathcal{H}\left[\mathbf{\rho }\right]\\ {\overline{\mathbf{g}}}_{2}^{T}\mathcal{H}\left[\mathbf{\rho }\right]\end{array}\right].$

Furthermore, Eq. (123) can be written in matrix form as

## Eq. (124)

$A\overline{\mathbf{g}}={\mathbf{0}}_{2},$
where

## Eq. (125)

$A=\left[\begin{array}{ccc}\mathcal{H}{\left[\mathbf{\rho }\right]}^{T}& {\mathbf{0}}_{3}^{T}& -u\mathcal{H}{\left[\mathbf{\rho }\right]}^{T}\\ {\mathbf{0}}_{3}^{T}& \mathcal{H}{\left[\mathbf{\rho }\right]}^{T}& -v\mathcal{H}{\left[\mathbf{\rho }\right]}^{T}\end{array}\right],\phantom{\rule{0ex}{0ex}}\overline{\mathbf{g}}={\left[\begin{array}{ccc}{\overline{\mathbf{g}}}_{1}^{T}& {\overline{\mathbf{g}}}_{2}^{T}& {\overline{\mathbf{g}}}_{3}^{T}\end{array}\right]}^{T}.$

Equation (124) relates a single point $\mathbf{\rho }$ on the reference plane with the corresponding point $\mathbf{\mu }$ on the image plane. If $n$ pairs $\left({\mathbf{\rho }}_{k},{\mathbf{\mu }}_{k}\right)$, with $k=1,2,\cdots ,n$, are available, the $n$ corresponding equations of the form of Eq. (124) can be written as

## Eq. (126)

$\mathbb{A}\overline{\mathbf{g}}={\mathbf{0}}_{2n},$
where

## Eq. (127)

$\mathbb{A}={\left[\begin{array}{cccc}{A}_{1}^{T}& {A}_{2}^{T}& \cdots & {A}_{n}^{T}\end{array}\right]}^{T},\phantom{\rule{0ex}{0ex}}{A}_{k}=\left[\begin{array}{ccc}\mathcal{H}{\left[{\mathbf{\rho }}_{k}\right]}^{T}& {\mathbf{0}}_{3}^{T}& -{u}_{k}\mathcal{H}{\left[{\mathbf{\rho }}_{k}\right]}^{T}\\ {\mathbf{0}}_{3}^{T}& \mathcal{H}{\left[{\mathbf{\rho }}_{k}\right]}^{T}& -{v}_{k}\mathcal{H}{\left[{\mathbf{\rho }}_{k}\right]}^{T}\end{array}\right].$

The nontrivial solution $\overline{\mathbf{g}}$ of Eq. (126) can be obtained using the constraint $‖\overline{\mathbf{g}}‖=1$. Thus, by using the singular value decomposition of $\mathbb{A}$, the solution for $\overline{\mathbf{g}}$ is the right-singular vector corresponding to the smallest singular value of $\mathbb{A}$, see Appendix C of Ref. 14.

The application of this method is illustrated as follows. Consider the image shown in Fig. 12(a), where a letter-size sheet of paper printed with Melencolia I by Albrecht Dürer appears in the scene. Using the aspect ratio $1:1.2941$ of letter paper, the coordinates of the corners are fixed to

## Eq. (128)

${\mathbf{\rho }}_{1}={\left[1,1.2941\right]}^{T},\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{\rho }}_{3}=-{\mathbf{\rho }}_{1},\phantom{\rule{0ex}{0ex}}{\mathbf{\rho }}_{2}={\left[-1,1.2941\right]}^{T},\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{\rho }}_{4}=-{\mathbf{\rho }}_{2}.$

The coordinates of the imaged corners are

## Eq. (129)

${\mathbf{\mu }}_{1}={\left[-0.2858,0.5661\right]}^{T},\phantom{\rule{0ex}{0ex}}{\mathbf{\mu }}_{2}={\left[0.3826,-0.0938\right]}^{T},\phantom{\rule{0ex}{0ex}}{\mathbf{\mu }}_{3}={\left[-0.2884,-0.5403\right]}^{T},\phantom{\rule{0ex}{0ex}}{\mathbf{\mu }}_{4}={\left[-0.8479,-0.1135\right]}^{T},$
see yellow circles in Fig. 12(a). With these four pairs $\left({\mathbf{\rho }}_{k},{\mathbf{\mu }}_{k}\right)$, we obtain the homography

## Eq. (130)

$G=\left[\begin{array}{ccc}-0.2437& 0.2292& -0.2442\\ 0.2258& 0.1870& -0.0888\\ -0.0524& -0.0989& 0.8497\end{array}\right].$
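The estimation procedure of Eqs. (124) to (127) can be sketched in a few lines of Python with NumPy. This is a minimal sketch, not the authors' implementation: the function names `homogeneous`, `estimate_homography`, and `project` are hypothetical, and the operator $\mathcal{H}$ is assumed to append a unit entry to a two-dimensional point. The point pairs are those of Eqs. (128) and (129). The sign of the result is arbitrary because the singular vector is defined only up to sign.

```python
import numpy as np

def homogeneous(p):
    """Operator H[.], assumed here to append a 1 to a 2-D point."""
    return np.append(np.asarray(p, dtype=float), 1.0)

def estimate_homography(rho, mu):
    """Direct linear transformation from n >= 4 pairs (rho_k, mu_k)."""
    zeros = np.zeros(3)
    rows = []
    for r, (u, v) in zip(rho, mu):
        h = homogeneous(r)
        # The two rows of A_k in Eq. (127) for this correspondence.
        rows.append(np.hstack([h, zeros, -u * h]))
        rows.append(np.hstack([zeros, h, -v * h]))
    A = np.vstack(rows)  # shape (2n, 9)
    # Unit-norm right-singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)  # rows are g1^T, g2^T, g3^T

def project(G, rho):
    """Eq. (122): map a reference-plane point to the image plane."""
    q = G @ homogeneous(rho)
    return q[:2] / q[2]

# Corner coordinates of Eq. (128) and imaged corners of Eq. (129).
rho = [[1, 1.2941], [-1, 1.2941], [-1, -1.2941], [1, -1.2941]]
mu = [[-0.2858, 0.5661], [0.3826, -0.0938],
      [-0.2884, -0.5403], [-0.8479, -0.1135]]

G = estimate_homography(rho, mu)
for r, m in zip(rho, mu):
    print(project(G, r), m)  # reprojected point next to the measured one
```

With exactly four pairs, the reprojected points should coincide with the measured ones up to floating-point error. With more than four pairs, the same code returns a least-squares solution under the constraint $‖\overline{\mathbf{g}}‖=1$.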

The homography $G$ fully defines a pinhole imaging process. Thus, it can be inverted to obtain an undistorted view of the reference plane from its perspective-distorted image. Specifically, using Eq. (89), all points $\mathbf{\mu }$ of the image are transformed to points $\mathbf{\rho }$ of the reference plane. Then, the pixels of the image are displayed at the points $\mathbf{\rho }$ as shown in Fig. 12(b). Note that the corners of the paper in the corrected image are at the coordinates specified by Eq. (128).
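The inverse mapping can be sketched as follows. The sketch assumes that the mapping from image points to reference-plane points is the inverse of Eq. (122), that is, $\mathbf{\rho }$ is obtained by normalizing ${G}^{-1}\mathcal{H}\left[\mathbf{\mu }\right]$ by its third entry. The function name `image_to_reference` is hypothetical. The matrix and the points are the rounded values of Eqs. (129) and (130), so the recovered corners match Eq. (128) only approximately.

```python
import numpy as np

# Homography of Eq. (130), rounded to four decimals.
G = np.array([[-0.2437, 0.2292, -0.2442],
              [0.2258, 0.1870, -0.0888],
              [-0.0524, -0.0989, 0.8497]])

def image_to_reference(G, mu):
    """Map image points mu (n x 2) back to the reference plane."""
    mu = np.atleast_2d(np.asarray(mu, dtype=float))
    # Append a unit entry to each point (assumed form of the operator H).
    h_mu = np.hstack([mu, np.ones((len(mu), 1))])
    # Solve G q = H[mu] for every point instead of forming G^{-1} explicitly.
    q = np.linalg.solve(G, h_mu.T).T
    return q[:, :2] / q[:, 2:3]

# Imaged corners of Eq. (129).
mu = [[-0.2858, 0.5661], [0.3826, -0.0938],
      [-0.2884, -0.5403], [-0.8479, -0.1135]]
rho_est = image_to_reference(G, mu)
print(rho_est)  # close to the corner coordinates of Eq. (128)
```

To correct a whole image, the same function would be applied to the coordinates of every pixel, followed by resampling on a regular grid. That step is omitted here.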

The minimum number of point correspondences for two-dimensional homography estimation is four. However, the accuracy of the estimation improves when more than four point correspondences are provided. For this reason, checkerboard patterns15 and gratings16,17 are useful target objects. In this appendix, the corner points of the imaged rectangle were obtained manually from the image. However, the corner points can be obtained automatically using checkerboard patterns or gratings along with grid detection18 or phase demodulation,19 respectively.

## Appendix B:

### Camera Parameters from Homographies

The homography matrix involves both intrinsic $K$ and extrinsic $L$ camera parameters as well as the reference plane parameters $\mathrm{\Pi }$. In this appendix, we show how to obtain the intrinsic and extrinsic camera parameters from several homographies.

## B.1.

#### Intrinsic Camera Parameters

Consider that the reference plane is the $xy$-plane of the world coordinate system; i.e., $\mathbf{s}={\mathbf{0}}_{3}$ and

## Eq. (131)

$\overline{Q}=\left[\begin{array}{cc}1& 0\\ 0& 1\\ 0& 0\end{array}\right].$

In this case, the homography $G$, defined in Eq. (86), is reduced to

## Eq. (132)

$G=K\left[\begin{array}{ccc}{\overline{\mathbf{r}}}_{1}& {\overline{\mathbf{r}}}_{2}& -{R}^{T}\mathbf{t}\end{array}\right],$
where ${\overline{\mathbf{r}}}_{1}^{T}$ and ${\overline{\mathbf{r}}}_{2}^{T}$ are the first and second rows of the rotation matrix $R$, respectively. Consider that the matrix $G$ is column partitioned as follows:

## Eq. (133)

$G=\left[\begin{array}{ccc}{g}_{11}& {g}_{12}& {g}_{13}\\ {g}_{21}& {g}_{22}& {g}_{23}\\ {g}_{31}& {g}_{32}& {g}_{33}\end{array}\right]=\left[\begin{array}{ccc}{\mathbf{g}}_{1}& {\mathbf{g}}_{2}& {\mathbf{g}}_{3}\end{array}\right],$
where

## Eq. (134)

${\mathbf{g}}_{1}=\left[\begin{array}{c}{g}_{11}\\ {g}_{21}\\ {g}_{31}\end{array}\right],\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{g}}_{2}=\left[\begin{array}{c}{g}_{12}\\ {g}_{22}\\ {g}_{32}\end{array}\right],\phantom{\rule[-0.0ex]{1em}{0.0ex}}{\mathbf{g}}_{3}=\left[\begin{array}{c}{g}_{13}\\ {g}_{23}\\ {g}_{33}\end{array}\right].$

Thus, Eq. (132) can be written as

## Eq. (135)

$\left[\begin{array}{ccc}{\overline{\mathbf{r}}}_{1}& {\overline{\mathbf{r}}}_{2}& -{R}^{T}\mathbf{t}\end{array}\right]={K}^{-1}\left[\begin{array}{ccc}{\mathbf{g}}_{1}& {\mathbf{g}}_{2}& {\mathbf{g}}_{3}\end{array}\right].$

Since ${\overline{\mathbf{r}}}_{1}$ and ${\overline{\mathbf{r}}}_{2}$ are orthonormal vectors (${\overline{\mathbf{r}}}_{1}$ and ${\overline{\mathbf{r}}}_{2}$ are rows of a rotation matrix), we have the following two constraints ${\overline{\mathbf{r}}}_{1}^{T}{\overline{\mathbf{r}}}_{2}=0$ and ${‖{\overline{\mathbf{r}}}_{1}‖}^{2}={‖{\overline{\mathbf{r}}}_{2}‖}^{2}$, which can be written as

## Eq. (136)

${\mathbf{g}}_{1}^{T}W{\mathbf{g}}_{2}=0,\phantom{\rule{0ex}{0ex}}{\mathbf{g}}_{1}^{T}W{\mathbf{g}}_{1}={\mathbf{g}}_{2}^{T}W{\mathbf{g}}_{2},$
where the symmetric matrix $W$ is defined as

## Eq. (137)

$W={K}^{-T}{K}^{-1}=\left[\begin{array}{ccc}{w}_{11}& {w}_{12}& {w}_{13}\\ {w}_{12}& {w}_{22}& {w}_{23}\\ {w}_{13}& {w}_{23}& {w}_{33}\end{array}\right].$

The bilinear form ${\mathbf{g}}_{i}^{T}W{\mathbf{g}}_{j}$ can be rewritten as

## Eq. (138)

${\mathbf{g}}_{i}^{T}W{\mathbf{g}}_{j}={\mathcal{V}}_{ij}\left[G\right]\mathbf{w},$
where

## Eq. (139)

${\mathcal{V}}_{ij}\left[G\right]={\left[\begin{array}{c}{g}_{1i}{g}_{1j}\\ {g}_{2i}{g}_{2j}\\ {g}_{3i}{g}_{3j}\\ {g}_{2i}{g}_{1j}+{g}_{1i}{g}_{2j}\\ {g}_{3i}{g}_{1j}+{g}_{1i}{g}_{3j}\\ {g}_{3i}{g}_{2j}+{g}_{2i}{g}_{3j}\end{array}\right]}^{T},$
and

## Eq. (140)

$\mathbf{w}={\left[\begin{array}{cccccc}{w}_{11}& {w}_{22}& {w}_{33}& {w}_{12}& {w}_{13}& {w}_{23}\end{array}\right]}^{T}.$

Then, the constraints given by Eq. (136) become

## Eq. (141)

$V\left[G\right]\mathbf{w}={\mathbf{0}}_{2},$
where $V\left[G\right]$ is the following $2×6$ matrix:

## Eq. (142)

$V\left[G\right]=\left[\begin{array}{c}{\mathcal{V}}_{12}\left[G\right]\\ {\mathcal{V}}_{11}\left[G\right]-{\mathcal{V}}_{22}\left[G\right]\end{array}\right].$

A nontrivial solution of Eq. (141) for $\mathbf{w}$ can be obtained using several homographies ${G}_{k}$, $k=1,2,\cdots ,m$. For this, we compute the homographies of different images where the position and orientation of the reference plane (or the camera, or both) are varying (in an unknown manner) while the intrinsic camera parameters remain constant. Thus, we solve the new matrix equation

## Eq. (143)

$\mathbb{V}\mathbf{w}={\mathbf{0}}_{2m},$
where

## Eq. (144)

$\mathbb{V}={\left[\begin{array}{cccc}V{\left[{G}_{1}\right]}^{T}& V{\left[{G}_{2}\right]}^{T}& \cdots & V{\left[{G}_{m}\right]}^{T}\end{array}\right]}^{T}.$

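Each homography contributes two equations, and $\mathbf{w}$ has six entries defined up to scale. Therefore, in general, at least three homographies ($m=3$) are required. However, two homographies are sufficient if zero skew is assumed.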
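The construction of Eqs. (139) to (144) can be checked numerically with a short sketch. The homographies below are synthetic: they are generated from Eq. (132) using an arbitrary matrix `K_true` and arbitrary poses chosen only for illustration, not taken from the experiment. The function names are hypothetical.

```python
import numpy as np

def v_row(G, i, j):
    """Row vector V_ij[G] of Eq. (139); i and j are 1-based column indices."""
    a, b = G[:, i - 1], G[:, j - 1]
    return np.array([a[0] * b[0], a[1] * b[1], a[2] * b[2],
                     a[1] * b[0] + a[0] * b[1],
                     a[2] * b[0] + a[0] * b[2],
                     a[2] * b[1] + a[1] * b[2]])

def constraint_matrix(G):
    """The 2 x 6 matrix V[G] of Eq. (142)."""
    return np.vstack([v_row(G, 1, 2), v_row(G, 1, 1) - v_row(G, 2, 2)])

def solve_w(homographies):
    """Stack V[G_k] as in Eq. (144) and solve Eq. (143) up to scale."""
    V = np.vstack([constraint_matrix(G) for G in homographies])
    _, _, Vt = np.linalg.svd(V)
    w = Vt[-1]  # ordering of Eq. (140): w11, w22, w33, w12, w13, w23
    return np.array([[w[0], w[3], w[4]],
                     [w[3], w[1], w[5]],
                     [w[4], w[5], w[2]]])

def rotation(ax, ay, az):
    """Rotation matrix Rz(az) @ Ry(ay) @ Rx(ax), angles in radians."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Synthetic intrinsic matrix and poses (illustrative values only).
K_true = np.array([[2.6, 0.01, -0.04], [0.0, 2.7, 0.02], [0.0, 0.0, 1.0]])
poses = [((0.30, -0.20, 0.10), (0.1, -0.2, 3.0)),
         ((-0.40, 0.25, 0.50), (-0.3, 0.1, 2.5)),
         ((0.15, 0.45, -0.30), (0.2, 0.3, 3.5)),
         ((-0.20, -0.35, 0.80), (0.0, -0.1, 4.0))]

homographies = []
for angles, t in poses:
    Rt = rotation(*angles).T  # columns of R^T are the rows of R
    # Eq. (132): G = K [r1 r2 -R^T t]
    homographies.append(
        K_true @ np.column_stack([Rt[:, 0], Rt[:, 1], -Rt @ np.array(t)]))

W_est = solve_w(homographies)
K_inv = np.linalg.inv(K_true)
W_true = K_inv.T @ K_inv
print(W_est / W_est[2, 2])   # estimate, normalized to remove the scale
print(W_true / W_true[2, 2])  # reference, same normalization
```

The two printed matrices should agree because the estimate is unique up to scale. Note that only the first two columns of each homography enter the constraints.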

Equation (143) can be solved for $\mathbf{w}$ using the singular value decomposition method, see Appendix C of Ref. 14. Since the obtained solution, labeled as $\stackrel{˜}{\mathbf{w}}$, is unique up to scale, the associated matrix $\stackrel{˜}{W}$ is related to $W$ by

## Eq. (145)

$\stackrel{˜}{W}=\lambda W=\lambda {K}^{-T}{K}^{-1},$
where $\lambda \ne 0$ is an unknown constant. With the estimated matrix $\stackrel{˜}{W}$, the unknown scalar $\lambda$ and the entries ${k}_{ij}$ of the intrinsic parameter matrix

## Eq. (146)

$K=\left[\begin{array}{ccc}{k}_{11}& {k}_{12}& {k}_{13}\\ 0& {k}_{22}& {k}_{23}\\ 0& 0& 1\end{array}\right],$
are given in closed form as

## Eq. (147)

$\lambda =\left(\mathrm{det}\text{\hspace{0.17em}}\stackrel{˜}{W}\right)/d,\phantom{\rule{0ex}{0ex}}{k}_{11}=\sqrt{\lambda /{\stackrel{˜}{w}}_{11}},\phantom{\rule{0ex}{0ex}}{k}_{22}=\sqrt{\lambda {\stackrel{˜}{w}}_{11}/d},\phantom{\rule{0ex}{0ex}}{k}_{12}=-{\stackrel{˜}{w}}_{12}\sqrt{\lambda /\left({\stackrel{˜}{w}}_{11}d\right)},\phantom{\rule{0ex}{0ex}}{k}_{13}=\left({\stackrel{˜}{w}}_{12}{\stackrel{˜}{w}}_{23}-{\stackrel{˜}{w}}_{22}{\stackrel{˜}{w}}_{13}\right)/d,\phantom{\rule{0ex}{0ex}}{k}_{23}=\left({\stackrel{˜}{w}}_{12}{\stackrel{˜}{w}}_{13}-{\stackrel{˜}{w}}_{11}{\stackrel{˜}{w}}_{23}\right)/d,$
where $d={\stackrel{˜}{w}}_{11}{\stackrel{˜}{w}}_{22}-{\stackrel{˜}{w}}_{12}^{2}$.
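The closed-form expressions of Eq. (147) can be sketched as a small function. The name `intrinsics_from_w` is hypothetical. The sign flip at the start is an assumption of this sketch, not a step stated in the text: the singular value decomposition returns the solution with an arbitrary sign, and flipping it so that ${\stackrel{˜}{w}}_{11}>0$ makes $\lambda$ positive, so that the sign of ${k}_{12}$ does not depend on that arbitrary sign. The test matrix `K_true` is an illustrative value.

```python
import numpy as np

def intrinsics_from_w(W_tilde):
    """Recover K and lambda from W_tilde = lambda K^{-T} K^{-1}, Eq. (147)."""
    W = np.array(W_tilde, dtype=float)
    # Assumption of this sketch: fix the arbitrary sign so that w11 > 0.
    if W[0, 0] < 0:
        W = -W
    w11, w12, w13 = W[0, 0], W[0, 1], W[0, 2]
    w22, w23 = W[1, 1], W[1, 2]
    d = w11 * w22 - w12 ** 2
    lam = np.linalg.det(W) / d
    k11 = np.sqrt(lam / w11)
    k22 = np.sqrt(lam * w11 / d)
    k12 = -w12 * np.sqrt(lam / (w11 * d))
    k13 = (w12 * w23 - w22 * w13) / d
    k23 = (w12 * w13 - w11 * w23) / d
    K = np.array([[k11, k12, k13], [0.0, k22, k23], [0.0, 0.0, 1.0]])
    return K, lam

# Round trip with an illustrative matrix and a negative scale factor.
K_true = np.array([[2.6, 0.05, -0.04], [0.0, 2.7, 0.02], [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K_true)
W_tilde = -2.5 * (K_inv.T @ K_inv)

K_est, lam = intrinsics_from_w(W_tilde)
print(K_est)  # should reproduce K_true
print(lam)    # magnitude of the scale factor after the sign flip
```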

It is worth mentioning that the individual intrinsic camera parameters ($f$, ${s}_{x}$, ${s}_{y}$, ${\tau }_{x}$, ${\tau }_{y}$, and $\sigma$) cannot be obtained from the matrix $K$ alone. Fortunately, the matrix $K$ is sufficient for many computer vision tasks. When the intrinsic camera parameters are required explicitly, we can assume that the skew and the pixel size are known (e.g., ${s}_{x}$, ${s}_{y}$, and $\sigma$ can be taken from the datasheet of the camera sensor). Then, the estimation of the remaining intrinsic parameters is a linear problem with the least-squares solution

## Eq. (148)

$f={s}_{x}{s}_{y}\frac{{s}_{x}{k}_{22}+{s}_{y}{k}_{11}+{s}_{x}{s}_{y}\sigma {k}_{12}}{{s}_{x}^{2}+{s}_{y}^{2}+{s}_{x}^{2}{s}_{y}^{2}{\sigma }^{2}},$

## Eq. (149)

${\tau }_{x}={s}_{x}{k}_{13},$

## Eq. (150)

${\tau }_{y}={s}_{y}{k}_{23}.$
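Equations (148) to (150) translate directly into code. The sketch below uses hypothetical function names and illustrative input values that are not taken from the experiment. For square pixels and zero skew, Eq. (148) reduces to the pixel size times the mean of ${k}_{11}$ and ${k}_{22}$, which the example reflects.

```python
def focal_length(K, sx, sy, sigma=0.0):
    """Least-squares focal length of Eq. (148)."""
    k11, k12, k22 = K[0][0], K[0][1], K[1][1]
    numerator = sx * k22 + sy * k11 + sx * sy * sigma * k12
    denominator = sx ** 2 + sy ** 2 + (sx * sy * sigma) ** 2
    return sx * sy * numerator / denominator

def principal_point(K, sx, sy):
    """Offsets tau_x and tau_y of Eqs. (149) and (150)."""
    return sx * K[0][2], sy * K[1][2]

# Illustrative values only: square pixels of size 2 (arbitrary units), zero skew.
K = [[2.5, 0.0, 0.1],
     [0.0, 2.5, -0.2],
     [0.0, 0.0, 1.0]]
f = focal_length(K, sx=2.0, sy=2.0, sigma=0.0)
tau_x, tau_y = principal_point(K, sx=2.0, sy=2.0)
print(f, tau_x, tau_y)
```

The result is expressed in the same length unit as ${s}_{x}$ and ${s}_{y}$.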

## B.2.

#### Extrinsic Camera Parameters

Once the matrix $K$ is available, the rotation matrix $R$ and the translation vector $\mathbf{t}$ can be estimated for each provided homography as follows. First, we compute the estimate ${\stackrel{˜}{R}}^{T}$ of the matrix ${R}^{T}$ as

## Eq. (151)

${\stackrel{˜}{R}}^{T}=\left[\begin{array}{ccc}{\mathbf{h}}_{1}& {\mathbf{h}}_{2}& {\mathbf{h}}_{1}×{\mathbf{h}}_{2}\end{array}\right],$
where using Eq. (135), the vectors ${\mathbf{h}}_{1}$ and ${\mathbf{h}}_{2}$ are given as

## Eq. (152)

${\mathbf{h}}_{1}={K}^{-1}{\mathbf{g}}_{1},\phantom{\rule{0ex}{0ex}}{\mathbf{h}}_{2}={K}^{-1}{\mathbf{g}}_{2}.$

Then, the rotation matrix $R$ is obtained from $\stackrel{˜}{R}$ ensuring the orthogonality condition of rotation matrices. For this, the singular value decomposition $\stackrel{˜}{R}=U\mathrm{\Sigma }{V}^{T}$ is obtained and the required rotation matrix is determined as

## Eq. (153)

$R=U{V}^{T}.$

Finally, the translation vector $\mathbf{t}$ is computed as

## Eq. (154)

$\mathbf{t}=-R{K}^{-1}{\mathbf{g}}_{3}.$
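The three steps of Eqs. (151) to (154) can be sketched as follows. Two points are assumptions of this sketch rather than statements of the text. First, an estimated homography is defined only up to scale, so the sketch divides ${K}^{-1}G$ by the norm of its first column before applying Eqs. (151) and (154). Second, the scale factor is assumed to be positive. The function names and the synthetic pose are hypothetical.

```python
import numpy as np

def extrinsics_from_homography(K, G):
    """Rotation R and translation t from K and one homography G."""
    H = np.linalg.solve(K, G)  # columns are K^{-1} g1, K^{-1} g2, K^{-1} g3
    # Assumption: remove the (positive) overall scale of G.
    H = H / np.linalg.norm(H[:, 0])
    h1, h2 = H[:, 0], H[:, 1]
    Rt_tilde = np.column_stack([h1, h2, np.cross(h1, h2)])  # Eq. (151)
    U, _, Vt = np.linalg.svd(Rt_tilde.T)                    # SVD of R tilde
    R = U @ Vt                                              # Eq. (153)
    t = -R @ H[:, 2]                                        # Eq. (154)
    return R, t

def rotation(ax, ay, az):
    """Rotation matrix Rz(az) @ Ry(ay) @ Rx(ax), angles in radians."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Synthetic check with illustrative values and an arbitrary positive scale.
K = np.array([[2.6, 0.01, -0.04], [0.0, 2.7, 0.02], [0.0, 0.0, 1.0]])
R_true = rotation(0.3, -0.2, 0.1)
t_true = np.array([0.1, -0.2, 3.0])
Rt = R_true.T
G = 3.0 * K @ np.column_stack([Rt[:, 0], Rt[:, 1], -Rt @ t_true])  # Eq. (132)

R_est, t_est = extrinsics_from_homography(K, G)
print(R_est)
print(t_est)
```

With noisy homographies, the matrix of Eq. (151) is not exactly orthogonal, and the singular value decomposition step returns the closest orthogonal matrix.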

## B.3.

#### Illustrative Example

As an example, we describe a simple experiment to obtain the intrinsic parameters of a camera. A camera with a pixel size of $6\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$ (square pixel), a resolution of $752×480\text{\hspace{0.17em}\hspace{0.17em}}\text{pixel}$, and an imaging lens with a focal length of 6 mm was used. The $3×3$ checkerboard pattern shown in Fig. 13(a) was printed on letter paper. Then, 15 images of the printed pattern lying on the reference plane were captured from different unknown viewpoints, see Figs. 13(b)–13(i).

We use the coordinates of the corners shown in Fig. 13(a) as the known points ${\mathbf{\rho }}_{k}$ on the reference plane. The corresponding points ${\mathbf{\mu }}_{k}$ in the image plane were obtained by marking the corners of the checkerboard pattern in the image. Then, with the pairs $\left({\mathbf{\rho }}_{k},{\mathbf{\mu }}_{k}\right)$, a homography matrix was computed for each acquired image. With these homographies, the matrix $\mathbb{V}$ defined in Eq. (144) was created. Then, Eq. (143) was solved for $\mathbf{w}$. The resulting matrix $\stackrel{˜}{W}$ is

## Eq. (155)

$\stackrel{˜}{W}=\left[\begin{array}{ccc}-0.1389& 0.0005& -0.0058\\ 0.0005& -0.1378& -0.0008\\ -0.0058& -0.0008& -0.9806\end{array}\right].$

From this, the intrinsic parameter matrix $K$ was recovered as

## Eq. (156)

$K=\left[\begin{array}{ccc}2.6563& -0.0103& -0.0419\\ 0& 2.6674& -0.0059\\ 0& 0& 1\end{array}\right].$

For validation purposes, we estimate the focal length using the known pixel size of $6\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$ and zero skew ($\sigma =0$). Note that, in this experiment, the quantities ${s}_{x}$ and ${s}_{y}$ are defined as

## Eq. (157)

${s}_{x}={s}_{y}=\frac{752}{2}×6×{10}^{-3}\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mm}=2.256\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mm}$
because the points $\mathbf{\mu }$ were obtained in a coordinate system with a unit of length equal to half of the image width, see Figs. 13(b)–13(i). From the matrix $K$ in Eq. (156), the focal length was estimated using Eq. (148). The result is $f=6.0032\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mm}$, which is very close to the nominal focal length (6 mm) of the employed camera lens.

## Fig. 12

(a) Image of $1168×2080\text{\hspace{0.17em}\hspace{0.17em}}\text{pixel}$ capturing a scene with the Melencolia I printed on a letter size paper. (b) Perspective-corrected image.

## Fig. 13

(a) A $3×3$ checkerboard pattern and the coordinates of the corners. (b)–(i) Eight of the 15 images of the reference plane acquired by a camera from different unknown viewpoints.

## Acknowledgments

This work was supported by CONACyT México through the project Cátedras/880.

## References

1. O. Faugeras, Q.-T. Luong and T. Papadopoulo, The Geometry of Multiple Images: the Laws that Govern the Formation of Multiple Images of a Scene and Some of Their Applications, MIT Press, Cambridge (2004).

2. O. D. Faugeras and S. Maybank, “Motion from point matches: multiplicity of solutions,” Int. J. Comput. Vision, 4 (3), 225–246 (1990). http://dx.doi.org/10.1007/BF00054997

3. Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell., 22 1330–1334 (2000). http://dx.doi.org/10.1109/34.888718

4. Y. Zhao and Y. Li, “Camera self-calibration from projection silhouettes of an object in double planar mirrors,” J. Opt. Soc. Am. A, 34 696–707 (2017). http://dx.doi.org/10.1364/JOSAA.34.000696

5. T. Taketomi et al., “Camera pose estimation under dynamic intrinsic parameter change for augmented reality,” Comput. Graphics, 44 11–19 (2014). http://dx.doi.org/10.1016/j.cag.2014.07.003

6. H. H. Ip and Y. Chen, “Planar rectification by solving the intersection of two circles under 2D homography,” Pattern Recognit., 38 (7), 1117–1120 (2005). http://dx.doi.org/10.1016/j.patcog.2004.12.004

7. B. Cyganek and J. P. Siebert, An Introduction to 3D Computer Vision Techniques and Algorithms, John Wiley & Sons Ltd., Chichester, West Sussex (2009).

8. O. Faugeras, Three-Dimensional Computer Vision: a Geometric Viewpoint, MIT Press, Cambridge (1993).

9. R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, Cambridge (2003).

10. J. L. Mundy and A. Zisserman, Appendix—Projective Geometry for Machine Vision, 463–519, MIT Press, Cambridge (1992).

11. W. Burger, “Zhang’s camera calibration algorithm: in-depth tutorial and implementation,” Hagenberg, Austria (2016).

12. F. Devernay and O. Faugeras, “Straight lines have to be straight,” Mach. Vision Appl., 13 (1), 14–24 (2001). http://dx.doi.org/10.1007/PL00013269

13. H. Zeng, X. Deng and Z. Hu, “A new normalized method on line-based homography estimation,” Pattern Recognit. Lett., 29 (9), 1236–1244 (2008). http://dx.doi.org/10.1016/j.patrec.2008.01.031

14. Z. Zhang, “A flexible new technique for camera calibration,” (1998).

15. L. Kr, “Accurate chequerboard corner localisation for camera calibration,” Pattern Recognit. Lett., 32 (10), 1428–1435 (2011). http://dx.doi.org/10.1016/j.patrec.2011.04.002

16. R. Juarez-Salazar et al., “Camera calibration by multiplexed phase encoding of coordinate information,” Appl. Opt., 54 4895–4906 (2015). http://dx.doi.org/10.1364/AO.54.004895

17. R. Juarez-Salazar, L. N. Gaxiola and V. H. Diaz-Ramirez, “Single-shot camera position estimation by crossed grating imaging,” Opt. Commun., 382 585–594 (2017). http://dx.doi.org/10.1016/j.optcom.2016.08.041

18. A. Herout, M. Dubská and J. Havel, Vanishing Points, Parallel Lines, Grids, 41–54, Springer, London (2013).

19. R. Juarez-Salazar, F. Guerrero-Sanchez and C. Robledo-Sanchez, “Theory and algorithms of an efficient fringe analysis technology for automatic measurement applications,” Appl. Opt., 54 5364–5374 (2015). http://dx.doi.org/10.1364/AO.54.005364

Biographies for the authors are not available.

© 2017 Society of Photo-Optical Instrumentation Engineers (SPIE)
Rigoberto Juarez-Salazar and Victor H. Díaz-Ramírez "Operator-based homogeneous coordinates: application in camera document scanning," Optical Engineering 56(7), 070801 (27 July 2017). https://doi.org/10.1117/1.OE.56.7.070801
Received: 11 May 2017; Accepted: 7 July 2017; Published: 27 July 2017