Programming Project #3 (proj3A) (first part)
IMAGE WARPING and MOSAICING
(first part of a larger project)
The goal of this assignment is to get your hands dirty in different aspects of image warping with a “cool”
application -- image mosaicing. You will take two or more photographs and create an image
mosaic by registering, projective warping, resampling, and compositing them. Along the way, you will learn how to compute
homographies, and how to use them to warp images.
The steps of the assignment are:
1. Shoot and digitize pictures
2. Recover homographies
3. Warp the images (testing via image rectification)
4. Blend the images into a mosaic
There are built-in functions that can do much of what is needed here. However, we
want you to write your own code. Therefore, you are not allowed to use the following
functions in your solution: cv2.findHomography, cv2.warpPerspective, cv2.getPerspectiveTransform, skimage.transform.ProjectiveTransform, skimage.transform.warp, or similar high-level homography and warping functions. On the other hand, there are a number of very helpful functions (e.g.
for solving linear systems, inverting matrices, etc) that you are welcome to use. If there is a question whether a particular
function is allowed, ask us.
Shoot two or more photographs so that the transforms between them are projective (a.k.a. perspective). The most common way is to
fix the center of projection (COP) and rotate your camera while capturing photos.
We expect you to acquire most of the data yourself, but you are free to supplement the photos you take with other sources (e.g. old photographs, scanned images, the Internet).
We're not particular about how you take your pictures or get them into the computer.
Good scenes include building interiors with lots of detail, the inside of a canyon or forest, tall waterfalls, and panoramas. The mosaic can
extend horizontally, vertically, or can tile a sphere. You might want to shoot several such image sets and choose the best.
Deliverables: Show at least 2 sets of images with projective transformations
between them (fixed center of projection, rotate camera).
Before you can warp your images into alignment, you need to recover the parameters of the transformation between each pair of
images. In our case, the transformation is a homography:
p’=Hp, where H is a 3x3 matrix
with 8 degrees of freedom (lower right corner is a scaling factor and can be set to 1). One way to recover the homography is via
a set of (p’,p) pairs of corresponding points taken from the two images. You will need to write a function of the form:
H = computeH(im1_pts,im2_pts)
where im1_pts and im2_pts
are n-by-2 matrices holding the (x,y) locations of n point correspondences from the two images and H is the recovered 3x3
homography matrix. In order to compute the entries of H, you will need to set
up a linear system of 2n equations (each point correspondence contributes two), i.e. a matrix equation of the form Ah=b where h
is a vector holding the 8 unknown entries of H. If n=4, the system can be solved exactly using a
standard technique. However, with only four points the homography recovery will be very
unstable and prone to noise. Therefore, you should provide more than 4 correspondences,
producing an overdetermined system to be solved using least-squares.
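For concreteness, here is one minimal least-squares sketch of computeH using NumPy, assuming the convention that H maps im1_pts onto im2_pts; treat it as a sketch of the technique, not the required implementation. The expansion of p'=Hp into two linear equations per correspondence is spelled out in the comments.

import numpy as np

def computeH(im1_pts, im2_pts):
    # Recover the 3x3 homography H mapping im1_pts onto im2_pts.
    # im1_pts, im2_pts: n-by-2 arrays of (x, y) correspondences, n >= 4.
    n = im1_pts.shape[0]
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i in range(n):
        x, y = im1_pts[i]
        xp, yp = im2_pts[i]
        # With h33 = 1, p' = Hp expands to
        #   xp = (h11*x + h12*y + h13) / (h31*x + h32*y + 1)
        #   yp = (h21*x + h22*y + h23) / (h31*x + h32*y + 1);
        # multiplying through by the denominator gives two linear equations:
        A[2 * i]     = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i], b[2 * i + 1] = xp, yp
    # Least-squares solution of the overdetermined system Ah = b.
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)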
Establishing point correspondences is a tricky business. An error of a couple of pixels can produce huge changes in the
recovered homography. The typical way of providing point matches is with a mouse-clicking
interface. You can write your own using matplotlib's
ginput
function, use this tool made by a prior student, or use online tools like
https://pixspy.com
or any image editing software that displays cursor coordinates (e.g., GIMP, Photoshop).
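If you go the ginput route, a minimal point-picking helper might look like the sketch below (pick_points is a hypothetical name, and the click order must match between the two images):

import matplotlib.pyplot as plt
import numpy as np

def pick_points(im, n):
    # Display the image and collect n clicked (x, y) points.
    plt.imshow(im)
    plt.title(f"Click {n} points, in order")
    pts = plt.ginput(n, timeout=0)  # blocks until n points are clicked
    plt.close()
    return np.array(pts)  # n-by-2 array of (x, y) locations

im1_pts = pick_points(im1, 8)
im2_pts = pick_points(im2, 8)  # click the matching points in the same order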
Deliverables: Implement the computeH(im1_pts, im2_pts) function. Show your
correspondences visualized on the images, your system of equations, and the recovered homography matrix.
Now that you know the parameters of the homography, you can use this homography to warp each image towards the reference image.
Write a function of the form:
imwarped_nn = warpImageNearestNeighbor(im,H)
imwarped_bil = warpImageBilinear(im,H)
where im is the input image to be warped and H is the homography. Use inverse warping (to avoid holes in the output image).
You must implement TWO interpolation methods from scratch (do NOT use
scipy.interpolate.griddata or similar library interpolation functions): nearest-neighbor and bilinear.
Show results for both interpolation methods and compare their quality. Discuss the trade-offs between speed and quality.
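As one possible shape for the warp, here is a minimal inverse-warping sketch with bilinear interpolation (the nearest-neighbor variant simply rounds the sampled source coordinates instead of blending). For simplicity it keeps the output canvas the same size as the input; in practice you will want to compute a proper output bounding box, as discussed under mosaicing below.

import numpy as np

def warpImageBilinear(im, H):
    # Inverse-warp image im by homography H with bilinear interpolation.
    imf = im.astype(np.float64)
    h, w = im.shape[:2]
    Hinv = np.linalg.inv(H)
    # All output pixel coordinates, in homogeneous form.
    ys, xs = np.mgrid[0:h, 0:w]
    out_pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    # Inverse-map every output pixel back into the source image.
    src = Hinv @ out_pts
    sx, sy = src[0] / src[2], src[1] / src[2]
    # Bilinear interpolation: blend the four surrounding source pixels.
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    valid = (x0 >= 0) & (x0 < w - 1) & (y0 >= 0) & (y0 < h - 1)
    x0, y0, sx, sy = x0[valid], y0[valid], sx[valid], sy[valid]
    a, b = sx - x0, sy - y0
    vals = (imf[y0, x0].T * (1 - a) * (1 - b)
            + imf[y0, x0 + 1].T * a * (1 - b)
            + imf[y0 + 1, x0].T * (1 - a) * b
            + imf[y0 + 1, x0 + 1].T * a * b).T
    out = np.zeros_like(imf)
    out[ys.ravel()[valid], xs.ravel()[valid]] = vals
    return out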
Rectification: Once you get this far, you should be able to test all of your code by performing
“rectification” on an image. Take a photo containing paintings, posters, or
any other known rectangular objects and make one of them rectangular using a homography.
You should do this before proceeding further to make sure your homography/warping is
working. Note that here you only have one image, so to compute a homography
for, say, ground-plane rectification (as if rotating the camera to point straight down), you will need to define the correspondences using
something you know about the image. E.g., if you know that the tiles on the floor are
square, you can click on the four corners of a tile and store them in im1_pts, while defining im2_pts by hand to be a square,
e.g. [0 0; 0 1; 1 0; 1 1]. This is a deliverable.
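Putting the pieces together, rectification is just computeH plus a warp. A hypothetical driver, reusing the pick_points, computeH, and warpImageBilinear sketches above and scaling the unit square up to an arbitrary 300-pixel tile so the result is visible:

import numpy as np

im1_pts = pick_points(im, 4)   # click the tile corners, matching the order below
side = 300                     # arbitrary output tile size in pixels
im2_pts = np.array([[0, 0], [0, side], [side, 0], [side, side]], dtype=float)
H = computeH(im1_pts, im2_pts)
rectified = warpImageBilinear(im, H)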
Deliverables: Implement warpImageNearestNeighbor(im, H) and warpImageBilinear(im,
H) using inverse warping. Apply to 2+ images for rectification. Show comparisons.
Warp the images so they're registered and create an image mosaic. Instead of having one picture overwrite the other, which would
lead to strong edge artifacts, use weighted averaging. You can leave one image unwarped and warp the other image(s) into its
projection, or you can warp all images into a new projection. Likewise, you can either warp
all the images at once in one shot, or add them one by one, slowly growing your mosaic.
If you choose the one-shot procedure, you should probably first determine the size of your final mosaic and then warp all your
images into that size. That way you will have a stack of images together defining the
mosaic. Now you need to blend them together to produce a single image.
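One way to determine the final mosaic size, sketched under the assumption that every homography maps its image into the reference frame, is to push each image's corners through its homography and take the union of the bounding boxes:

import numpy as np

def mosaic_bbox(images, homographies):
    # Bounding box, in reference-frame coordinates, containing all warped images.
    # homographies[i] maps images[i] into the reference frame.
    corners = []
    for im, H in zip(images, homographies):
        h, w = im.shape[:2]
        c = np.array([[0, 0, 1], [w, 0, 1], [0, h, 1], [w, h, 1]], dtype=float).T
        wc = H @ c
        corners.append((wc[:2] / wc[2]).T)  # warped (x, y) corners
    corners = np.vstack(corners)
    return corners.min(axis=0), corners.max(axis=0)  # (xmin, ymin), (xmax, ymax)

Composing a translation by (-xmin, -ymin) with each homography then keeps every warped pixel inside the canvas.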
If you used an alpha channel, you can apply simple feathering (weighted averaging) at every pixel.
Setting alpha for each image takes some thought. One suggestion is to set it to 1 at the
center of each (unwarped) image and make it fall off linearly until it hits 0 at the edges (or use a distance transform, e.g. MATLAB's bwdist or scipy.ndimage.distance_transform_edt). However, this can produce some strange wedge-like artifacts.
You can try minimizing these by using a more sophisticated blending technique, such as a Laplacian pyramid.
If your only problem is “ghosting” of high-frequency detail, then a 2-level pyramid should be enough.
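A minimal two-image feathering sketch, assuming both color images have already been warped onto a common float canvas with zeros outside their footprints:

from scipy.ndimage import distance_transform_edt

def feather_blend(im1, im2):
    # Weight each pixel by its distance to the nearest point outside the
    # image's footprint, so weights fall to 0 at the seams.
    w1 = distance_transform_edt(im1.sum(axis=2) > 0)
    w2 = distance_transform_edt(im2.sum(axis=2) > 0)
    total = w1 + w2
    total[total == 0] = 1  # avoid divide-by-zero where neither image covers
    a1, a2 = w1 / total, w2 / total
    return im1 * a1[..., None] + im2 * a2[..., None]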
If your mosaic spans more than 180 degrees, you'll need to break it into pieces, or else use non-projective mappings, e.g.
spherical or cylindrical projection.
Deliverables: Show 3 mosaics with source images. Use weighted averaging or
blending to reduce edge artifacts. Explain your procedure.
These extensions are optional and ungraded, but you're welcome to explore them if you're interested: