Setting up the app

Going forward, we will be using a famous open source dataset called fountain-P11. It depicts a Swiss fountain viewed from various angles. An example of this is shown in the following image:

The dataset consists of 11 high-resolution images and can be downloaded from http://cvlabwww.epfl.ch/data/multiview/denseMVS.html. Had we taken the pictures ourselves, we would have had to go through the entire camera calibration procedure to recover the intrinsic camera matrix and the distortion coefficients. Luckily, these parameters are known for the camera that took the fountain dataset, so we can go ahead and hardcode these values in our code.

The main function routine

Our main function routine will consist of creating and interacting with an instance of the SceneReconstruction3D class. This code can be found in the chapter4.py file, which imports all the necessary modules and instantiates the class:

import numpy as np

from scene3D import SceneReconstruction3D


def main():
    # camera matrix and distortion coefficients
    # can be recovered with calibrate.py
    # but the examples used here are already undistorted, taken 
    # with a camera of known K
    K = np.array([[2759.48/4, 0, 1520.69/4],
                  [0, 2764.16/4, 1006.81/4],
                  [0, 0, 1]])
    d = np.array([0.0, 0.0, 0.0, 0.0, 0.0]).reshape(1, 5)

Here, the K matrix is the intrinsic camera matrix for the camera that took the fountain dataset. According to the photographer, these images are already distortion free, so we set all the distortion coefficients (d) to zero.
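For reference, these hardcoded values map onto the standard pinhole intrinsic matrix as sketched below. The division by 4 presumably compensates for the downscaling applied to the high-resolution originals; that interpretation is an assumption, not stated by the dataset:

```python
import numpy as np

# Pinhole intrinsic matrix layout:
#     [[fx,  0, cx],
#      [ 0, fy, cy],
#      [ 0,  0,  1]]
K = np.array([[2759.48 / 4, 0, 1520.69 / 4],
              [0, 2764.16 / 4, 1006.81 / 4],
              [0, 0, 1]])
fx, fy = K[0, 0], K[1, 1]   # focal lengths in pixels
cx, cy = K[0, 2], K[1, 2]   # principal point in pixels
```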

Note

Note that if you want to run the code presented in this chapter on a dataset other than fountain-P11, you will have to adjust the intrinsic camera matrix and the distortion coefficients.

Next, we load a pair of images to which we would like to apply our structure-from-motion techniques. I downloaded the dataset into a subdirectory called fountain_dense:

# load a pair of images for which to perform SfM
scene = SceneReconstruction3D(K, d)
scene.load_image_pair("fountain_dense/0004.png", "fountain_dense/0005.png")

Now we are ready to perform various computations, such as the following:

scene.plot_optic_flow()
scene.draw_epipolar_lines()
scene.plot_rectified_images()

# draw 3D point cloud of fountain
# use "pan axes" button in pyplot to inspect the cloud (rotate
# and zoom to convince yourself of the result)
scene.plot_point_cloud()

The next sections will explain these functions in detail.

The SceneReconstruction3D class

All of the relevant 3D scene reconstruction code for this chapter can be found as part of the SceneReconstruction3D class in the scene3D module. Upon instantiation, the class stores the intrinsic camera parameters to be used in all subsequent calculations:

import cv2
import numpy as np
import sys

from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt


class SceneReconstruction3D:
    def __init__(self, K, dist):
        self.K = K
        self.K_inv = np.linalg.inv(K)
        self.d = dist

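As a quick illustrative check (not part of the class), the cached inverse maps homogeneous pixel coordinates back to normalized camera coordinates. The values below merely approximate the fountain intrinsics:

```python
import numpy as np

# intrinsics roughly matching the fountain dataset (already divided by 4)
K = np.array([[689.87, 0.0, 380.17],
              [0.0, 691.04, 251.70],
              [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K)

# a homogeneous pixel at the principal point back-projects to the
# optical axis [0, 0, 1] in normalized camera coordinates
pixel = np.array([380.17, 251.70, 1.0])
ray = K_inv @ pixel
```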
Then, the first step is to load a pair of images on which to operate:

def load_image_pair(self, img_path1, img_path2, downscale=True):
    self.img1 = cv2.imread(img_path1, cv2.IMREAD_COLOR)
    self.img2 = cv2.imread(img_path2, cv2.IMREAD_COLOR)

    # make sure images are valid
    if self.img1 is None:
        sys.exit("Image " + img_path1 + " could not be loaded.")
    if self.img2 is None:
        sys.exit("Image " + img_path2 + " could not be loaded.")

If a loaded image is grayscale, the method will convert it to BGR format, because the other methods expect a three-channel image:

if len(self.img1.shape) == 2:
    self.img1 = cv2.cvtColor(self.img1, cv2.COLOR_GRAY2BGR)
if len(self.img2.shape) == 2:
    self.img2 = cv2.cvtColor(self.img2, cv2.COLOR_GRAY2BGR)

In the case of the fountain sequence, all images are of a relatively high resolution. If an optional downscale flag is set, the method will downscale the images to a width of roughly 600 pixels:

# scale down image if necessary
# to something close to 600px wide
target_width = 600
if downscale and self.img1.shape[1] > target_width:
    while self.img1.shape[1] > 2 * target_width:
        self.img1 = cv2.pyrDown(self.img1)
        self.img2 = cv2.pyrDown(self.img2)

Also, we need to compensate for the radial and tangential lens distortions using the distortion coefficients specified earlier (if there are any):

self.img1 = cv2.undistort(self.img1, self.K, self.d)
self.img2 = cv2.undistort(self.img2, self.K, self.d)

Finally, we are ready to move on to the meat of the project: estimating the camera motion and reconstructing the scene!
