Could someone, in plain english, explain the distinction between the fundamental matrix and the essential matrix in multi-view computer vision?

How are they different, and how can each be used in computing the 3D position of a point imaged from multiple views?

s-low

3 Answers

Both matrices relate corresponding points in two images. The difference is that the fundamental matrix works on points in pixel coordinates, while the essential matrix works on points in "normalized image coordinates". Normalized image coordinates have their origin at the principal point (where the optical axis meets the image plane), and the $x$ and $y$ coordinates are divided by the focal lengths in pixels, $f_x$ and $f_y$ respectively, so that they are dimensionless.
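To make the normalization concrete, here is a minimal sketch that maps pixel coordinates to normalized image coordinates by applying $K^{-1}$. The intrinsic values (focal length 800 px, principal point at (320, 240)) and the helper name `pixel_to_normalized` are made up for illustration.

```python
import numpy as np

def pixel_to_normalized(pts_px, K):
    """Map Nx2 pixel coordinates to Nx2 normalized image coordinates."""
    pts_px = np.asarray(pts_px, dtype=float)
    # Append a 1 to each point to get homogeneous coordinates.
    pts_h = np.hstack([pts_px, np.ones((len(pts_px), 1))])
    # Undo the intrinsics: x_norm = K^-1 * x_pixel.
    norm_h = (np.linalg.inv(K) @ pts_h.T).T
    return norm_h[:, :2] / norm_h[:, 2:3]

# Hypothetical intrinsics, chosen only for this example.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

pts = np.array([[320.0, 240.0],   # the principal point
                [720.0, 240.0]])  # 400 px to the right of it
normalized = pixel_to_normalized(pts, K)
print(normalized)
```

The principal point maps to the origin, and a point 400 px to its right maps to $x = 400 / 800 = 0.5$.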

The two matrices are related as follows: $E = K^TFK$, where $K$ is the intrinsic matrix of the camera.

Note that this is a special case, where both images have been taken with the same camera. If the images were taken with different cameras, then you would have two different sets of intrinsics: $K$ and $K'$. Then $E = (K')^TFK$.
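The relation can be checked numerically. The sketch below builds a synthetic two-camera setup (all intrinsics, the pose, and the 3D point are invented for the example), constructs $E = [t]_\times R$ and the matching $F$, and then confirms that $(K')^TFK$ gives back $E$ and that both epipolar constraints hold. It assumes the convention $X' = RX + t$ for moving a point from the first camera's frame to the second's.

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Hypothetical intrinsics for two different cameras.
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[600.0, 0.0, 300.0], [0.0, 620.0, 250.0], [0.0, 0.0, 1.0]])

# Hypothetical relative pose: small rotation about the y axis plus a translation.
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([1.0, 0.0, 0.2])

E_true = skew(t) @ R
F = np.linalg.inv(K2).T @ E_true @ np.linalg.inv(K1)

# The relation from the answer: E = K'^T F K.
E_from_F = K2.T @ F @ K1

# One 3D point seen by both cameras.
X1 = np.array([0.3, -0.2, 5.0])   # in the first camera's frame
X2 = R @ X1 + t                   # in the second camera's frame

p1 = K1 @ X1; p1 /= p1[2]         # pixel coordinates (homogeneous)
p2 = K2 @ X2; p2 /= p2[2]
x1 = X1 / X1[2]                   # normalized coordinates (homogeneous)
x2 = X2 / X2[2]

residual_F = p2 @ F @ p1          # constraint in pixel coordinates
residual_E = x2 @ E_from_F @ x1   # constraint in normalized coordinates
print(residual_F, residual_E)
```

Both residuals are zero up to floating-point error, which is the epipolar constraint expressed once in pixels and once in normalized coordinates.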

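$F$ has 7 degrees of freedom, while $E$ has only 5, because the intrinsics have been factored out and $E$ depends only on the relative rotation (3 parameters) and the direction of the translation (2 parameters). That is why the minimal solver for the essential matrix is the 5-point algorithm. For the fundamental matrix the minimal solver needs 7 points; the well-known 8-point algorithm uses one extra correspondence so that $F$ can be found by solving a linear system.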
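For reference, here is a minimal sketch of the normalized 8-point algorithm in NumPy, run on noise-free synthetic data. The function names, camera, and scene are invented for the example; real data has noise and outliers, so in practice this is usually wrapped in a robust estimator such as RANSAC.

```python
import numpy as np

def _normalize(pts):
    """Hartley normalization: zero mean, average distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    return (T @ pts_h.T).T, T

def eight_point(p1, p2):
    """Estimate F with p2_h^T F p1_h = 0 from N >= 8 pixel matches (Nx2 each)."""
    q1, T1 = _normalize(np.asarray(p1, dtype=float))
    q2, T2 = _normalize(np.asarray(p2, dtype=float))
    # Each match gives one linear equation in the 9 entries of F (row-major).
    A = np.einsum('ni,nj->nij', q2, q1).reshape(len(q1), 9)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    F = U @ np.diag(S) @ Vt
    # Undo the normalization and fix the arbitrary scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)

def project(K, X):
    """Project Nx3 points (in the camera frame) to Nx2 pixel coordinates."""
    x = (K @ X.T).T
    return x[:, :2] / x[:, 2:3]

# Synthetic scene: 12 random points in front of two hypothetical cameras.
rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(12, 3))
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([0.5, 0.0, 0.1])

p1 = project(K, X)
p2 = project(K, X @ R.T + t)
F = eight_point(p1, p2)

h1 = np.hstack([p1, np.ones((len(p1), 1))])
h2 = np.hstack([p2, np.ones((len(p2), 1))])
residuals = np.abs(np.einsum('ni,ij,nj->n', h2, F, h1))
print(residuals.max())
```

The linear solve treats all 9 entries of $F$ as free (8 after fixing scale), which is why it needs 8 points; the rank-2 constraint that brings $F$ down to 7 degrees of freedom is only imposed afterwards.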

One way to get 3D camera motion from matching points in a pair of images is to estimate the fundamental matrix, compute the essential matrix from it, and then extract the rotation and translation between the cameras from the essential matrix. This, of course, assumes that you know the intrinsics of your camera. Also, this only gives you the motion up to scale, with the translation being a unit vector.
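The last step, getting rotation and translation out of $E$, can be sketched with an SVD. This is the standard textbook decomposition, not code from any particular library; it returns four candidate poses, and the names and the test pose are invented for the example.

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def decompose_essential(E):
    """Return the four (R, t) candidates for E, with t a unit vector."""
    U, _, Vt = np.linalg.svd(E)
    # E is only defined up to sign, so flip U or Vt to get proper rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]  # translation direction; its sign and scale are not recoverable here
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

# Hypothetical ground-truth pose used to build a test essential matrix.
c, s = np.cos(0.1), np.sin(0.1)
R_true = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t_true = np.array([1.0, 0.0, 0.2])
t_true /= np.linalg.norm(t_true)

candidates = decompose_essential(skew(t_true) @ R_true)
found = any(np.allclose(R, R_true, atol=1e-8) and np.allclose(t, t_true, atol=1e-8)
            for R, t in candidates)
print(found)
```

Only one of the four candidates places the triangulated points in front of both cameras, so the usual way to pick it is to triangulate a few matches and test their depths. Once $R$ and $t$ are known, the 3D position of any matched point follows by triangulation, again only up to the unknown overall scale. OpenCV users would typically reach for `cv2.findEssentialMat` and `cv2.recoverPose` instead of hand-rolling this.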

Dima

I want to add one thing to @Dima's answer, just so things are absolutely clear:

$E = (K')^TFK$ (note the transpose)

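You can see how this also captures that the fundamental matrix takes pixel coordinates as inputs and not normalized coordinates. Let $x$, $x'$ be two corresponding points in normalized image coordinates, written as homogeneous 3-vectors:

$(x')^TEx = (x')^T(K')^TFKx = (K'x')^TF(Kx)$

Here $Kx$ and $K'x'$ are the same points in pixel coordinates, so the epipolar constraint $(x')^TEx = 0$ in normalized coordinates is exactly the constraint $(K'x')^TF(Kx) = 0$ in pixel coordinates.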

f.k

The other answer does not emphasize some important points.

  1. the fundamental matrix $F$ is a generalization of the essential matrix $E$
  2. the fundamental matrix does not assume the cameras to be calibrated (i.e. you don't need to know $M$ and $M'$, the projection matrices that map a 3D point onto the image planes of cameras $C$ and $C'$, respectively)
  3. the essential matrix was discovered before the fundamental matrix
  4. in principle, to estimate the fundamental matrix you need more point-to-point correspondences than to estimate the essential matrix (because the fundamental matrix has more degrees of freedom, 7 versus 5, i.e. more parameters you need to find)