Robotic Vision

Z SensorWiki

Support page to the MOOC Course by Peter Corke

Further reading

Vision in humans and nature

   Stone, J. V. (2012).Vision and Brain: How We Perceive the World. Cambridge, MA: MIT Press.
   Ings, S. (2007). The Eye: A Natural History. London: Bloomsbury.
   Land, M. & Nilsson, D-E. (2002). Animal eyes. New York: Oxford University Press.
   Hughes, H. (1999). Sensory Exotica: A World beyond Human Experience. Cambridge, MA: MIT Press.

Computer and robot vision in general

   Corke, P. (2013). Robotics, Vision and Control. Berlin: Springer.
   Szeliski, R. (2011). Computer vision: Algorithms and applications. New York: Springer.

Multiview/3D vision specifically

   Hartley, R. & Zisserman, A. (2003). Multiple view geometry in computer vision. Cambridge: University Press.
   Ma, Y., Soatto, S., Košecká, J., & Sastry, S. (2006). An invitation to 3-D vision: From images to geometric models. New York: Springer.

Matlab code

 im = iread('/path/to/image.jpg');
 idisp(im); // special display for the course from the vision package

 im = iread('monalisa.png', 'grey', 'double');  // 0.0 - 1.0 instead of 0 - 255

Compare this

a = 100;
b = 200;


a = uint8(100);
b = uint8(200);


Getting an image from videocamera. Grab and display a image from the camera.

 cam = VideoCamera(0);
 im = cam.grab();
 clear cam // In order to turn the camera off, use the clear function.

Getting an image from movie.

cam = Movie('traffic_sequence.mpg');
cam.grab(); // next sequence image

Creating an image from code

Create a 200x200 matrix of zeros (representing black) and set the first (top, left) element to 1 (representing white). Display the image.

 im = zeros(200, 200);
 im(1,1) = 1;

 im(150:160,60:70) = 0.5;

 circle = kcircle(30);

 im = ipaste(im, circle*0.7, [100 30]);

 im = iline(im, [30 40], [150 190], 0.9);

 im = testpattern('rampx', 200);

 im = testpattern('squares', 200, 50, 25);

 im = iread('monalisa.png', 'grey');
 about im

 ans = 26

 im2 = im(180:210,260:290);

 lin = im(190,260:290);

 im3 = im(1:4:end,1:4:end);

 im4 = im(end:-1:1,1:end);


 im = iread('penguins.png', 'grey', 'double');
 b = im > 0.45;

 // Use the slider to adjust the threshold setting.

Diadic Operations

 cam = Movie('traffic_sequence.mpg', 'double', 'grey');  // double is important, because uint8 can't do e.g. 50 - 65

 // Extract the first and second frames from the movie. 
 im1 = cam.grab();
 im2 = cam.grab();

 // Display the image difference.

 idisp(im1-im2, 'invsigned')  // colors: red / blue for positive / negative values

Spatial operators: Moving average

  % Load the Mona Lisa image.
  im = iread('monalisa.png', 'grey', 'double');

  %  Perform an average over a 7x7 window by 
  %  correlating the image with a 7x7 matrix of 1's using the 
  %  toolbox function iconv(). Display with the original image.

  s7 = iconv(im, ones(7,7));
  % again with 21x21 window. 
  s21 = iconv(im, ones(21,21));

Introducing kernels

Create a Gaussian kernal with a standard deviation of 5 and a half radius of 15 pixels using the kgauss() function.

 K = kgauss(5, 15);
 % Display the kernel a greycale image.

 % Load the Mona Lisa image and correlate this with the Gaussian kernel. Display the image.
 im = iread('monalisa.png', 'grey', 'double');
 s = iconv(K, im);


 // Load the penguins.jpg image and display.
 im = iread('penguins.jpg','grey','double');

 ... zrejme Gaussian and Laplacian kernels application

Template matching

Load the wheres_walle.png image as greyscale and of type double. Display the image. Then load a pattern / template.

Using the function isimilarity() from Toolbox we can perform a template matching using ZNCC similarity function. This will take several minutes.

Similarity function return values between 0 and 1, where values approaching 1 is where the template is a good fit.

We can search for the position of best fit by using the toolbox function peak2() and passing in the similarity image, the pixel distance in which to test (i.e. we want to find the pixel which is brightest compared to the pixels within a distance of it) and let's find the top 5 matches.

The first of the output variables contains the similarity value of the top 5 matches and the second variable contains the related positions for these matches. We can locate and label these positions using the plot_circle() and plot_point() functions. For plot_circle() we want to plot circles of radius 30 pixels, of a blue color, an alpha value of 0.3 making them translucent and with no edge color. For plot_point we want to label them in sequence, with a font size of 36 and in the color yellow.

We can extract the region around the point of highest similarity using the toolbox function iroi() - display RegionOfInterest. We want to do this for a region of 50x50 pixels about the first point of p which was obtained earlier.

  crowd = iread('wheres_walle.png','grey', 'double');

  bender = iread('bender.png', 'double');

  S = isimilarity(bender, crowd, @zncc);

  [mx, p] = peak2(S, 1, 'npeaks', 5)
  % mx =
  %      0.6107    0.5580    0.5277    0.5092    0.4931
  % p =
  %     331        1165        1164         293        1013
  %     364         852         861         619         789

  plot_circle(p, 30, 'fillcolor', 'b', 'alpha', 0.3, 'edgecolor', 'none')
  plot_point(p, 'sequence', 'bold', 'textsize', 36, 'textcolor', 'y')

  iroi(crowd, p(:,1), 50)


Rank filtering

Use e.g. for salt&pepper noise removal. Here, we use a 3x3 window and choose the median value (rank 5). Why rank 5? 3x3 = 9, median = (9-1):2 + 1 = 5. Median is a middle value. Mid = (max - min) : 2

 out = irank(img, ones(3,3),5);

5. Feature extraction

Binary blobs

  im = iread('shark1.png');

   % Now, locate for COORDINATES of all non-zero points (white blob)
  [v,u] = find(im>0);

   % Now find a maximum and minimum coordinates
  umin = min(u)
  umax = max(u)
  vmin = min(v) 
  vmax = max(v)
  % and draw a green ('g') bounding box of the object
  plot_box(umin, vmin, umax, vmax, 'g');

  % find the center of this box, a rough estimation of the center of the shark, by finding the 
  % average of the minimum and maximum coordinates. Plot an asterix at this location.
  uc = (umin+umax)/2
  vc = (vmin+vmax)/2
  plot_point([uc vc]', '*')

  % A better estimation of the center of the shark can be found by calculating the centroid

  m00 = mpq(im, 0, 0)
  m10 = mpq(im, 1, 0)
  m01 = mpq(im, 0, 1)

  ucen = m10 / m00
  vcen = m01 / m00 
  plot_point([ucen vcen]', 'o')   % note the transposition ' of coordinates

Multiple blobs

  im = iread('shark2.png');

  [L, nb] = ilabel(im);

  idisp(L == 1)

  b = iblobs(im)

  b(1).area          % area:    the number of pixels
  b(1).class         % class:   the value of the pixels forming this region
  b(1).p             % p:       centroid (uc, vc)
  b(1).uc            % uc:      centroid, horizontal coordinate
  b(1).moments.m01   % moments: a structure containing moments of order 0 to 2

Blob hierarchy

   im = iread('multiblobs.png');

  [L,nb] = ilabel(im);

  b = iblobs(im)

Blob shape and orientation

  im = iread('shark2a.png');

  b = iblobs(im)
  b = iblobs(im, 'class', 1)

6. Color

See also:

   im = iread('flowers8.png');
   % Display RED image plane only>
   idisp( im(:,:,1) )

   % Green plane:
   idisp( im(:,:,2) )

   % Blue plane:
   idisp( im(:,:,3) )

   % RGB Value of a particular pixel
   %  However this returns the result in an inconvenient form.

   % function squeeze() helps remove the singleton dimesions from the result, 
   % returning the RGB value in a vector form.
   squeeze( im(100,100,:) )

   % Mixing colors
   % %%%%%%%%%%%%%

   colorname('maroon')   % looking into the database of more than 500 colornames
   colorname([0.1 0.2 0.3])  % an opposite operation

   % Create the 1-pix color image:
   pix = ones(1,1,3)
   pix(1,1,:) = colorname('maroon')

   % We can view the RGB color cube with the toolbox command rgbcube.
   % Use a mouse to rotate in 3D
   rgbcube  % probably not included in toolbox?

Planck's law

   % nanometer = 1x10^-9  
   nm = 1e-9;
   % create a vector of wavelengths from 300 to 1000 nm in steps of 10
   lambda = [300:10:1000]*nm;

   % blackbody radiator emission curve 
   E = blackbody(lambda, 6500);  %  Let's use the temperature of the sun, 6500K.
   plot(lambda, E)

   E = blackbody(lambda, 6000);
   hold on
   plot(lambda, E, '--')

Separating intensity from color

 im = iread('tomato_124.jpg', 'double');

 % perform gamma correction using an additional parameters
 % to eliminate non-linearities
 im = iread('tomato_124.jpg', 'double', 'gamma', 2.2);

 % view the individual color planes - red plane:

 % the green plane:

 % Extract each of the color planes into separate matrices 
 % and calculate the luminance image by summing these matrices. Display the luminence image.
 R = im(:,:,1);
 G = im(:,:,2);
 B = im(:,:,3);
 Y = R + G + B;


 % This luminance image can be used to calculate the chromaticity coordinates using 
 % the element-wise division operator (./). Display these chromaticity channels.
 r = R ./ Y;
 g = G ./ Y;


 % We can attempt to extract the tomato fruit in the image by performing thresholding 
 % and logical operations on the red and green chromaticity channels. Display this binary image.
 tom = (r > 0.8 & g < 0.2);

 % We can use the iblobs() Toolbox function to calculate a number of properties on the blobs in the 
 % image. Which can be used to plot bounding boxes on the orignal gamma-corrected image.

 b = iblobs(tom)
 % b = (1) area=72442, cent=(162.1,115.9), theta=-3.08, b/a=0.727, color=0, label=1, touch=1, parent=0
 %     (2) area=588, cent=(128.9,130.2), theta=2.87, b/a=0.329, color=1, label=2, touch=0, parent=1
 %      ...


Camera modelling

 % Create a point.
 P = [0.2 0.3 5]'

 % Create a CentralCamera object with a focal length of 8mm.
 cam = CentralCamera('focal', 0.008)

 % Project the world point onto the image plane.

 % There is no notion of pixels here however a digital camera can be modeled by using additional input 
 % parameters. Define the pixel size and the resolution.
 cam = CentralCamera('focal', 0.008, 'pixel', 10e-6, 'resolution',1024)

 % Project the world point P onto the image plane.

 % The image plane can be plotted using the method plot.

 % We can access the intrinsic parameters of the camera.

 % And the overall camera projection matrix.

 % We can pass an additional option to the plot method which shifts the camera as desired, here 
 % we are translating 0.1m in the x-direction. we can see that as we are moving the camera towards 
 % the right, the point is moving towards the left on the image plane.
 cam.project(P, 'Tcam', transl(0.1, 0, 0))

 % The camera can also be rotated. Here we rotate the camera 0.1 radians about the y-axis. 
 % Essentially this is rotating the camera towards the right.
 cam.project(P, 'Tcam', troty(0.1))

 % Use the toolbox function mkgrid() to make a grid of points spaced by 0.2 m and located 
 % about the point (1,2,0).
 P = mkgrid(2, 0.2, 'T', transl(1, 2, 0))

 % Create a new camera with default parameters except that it is translated 4m along the 
 % z-axis, rotated about the z-axis by pi/3 radians and finally rotated about the y-axis 
 % by pi radians such that it is pointing towards the ground.
 cam = CentralCamera('default', 'pose', transl(0,0,4)*trotz(pi/3)*troty(pi))

 % Project the grid of points onto the image plane.

 % Plot the image plane.

 % The camera and points can be visualised in 3 dimensions.
 plot_sphere(P, 0.05)


Create a MATLAB CentralCamera object

>> cam=CentralCamera('default', 'pose', transl(0,1,2)*troty(pi/2)*trotz(-pi/2));

Create a graphical working volume 10x10x10m

>> axis([-5 5 -5 5 -5 5 ])>

Display the camera in this environment

>> cam.plot_camera

Define a point in the world

>> P = [4 0 0]';

and display it as a red sphere of 0.2m radius

>> plot_sphere(P, 0.2)

Now we can project the point to the image plane

>> cam.project(P)

or even visualise what the point looks likes on the image plane

>> cam.plot(P)

Try some variations on this scenario. Pehaps move the point, or the camera, or change the focal length of the camera. See the documentation for details.

>> help CentralCamera

Using homographies

... continuing:

 % create a new variable which contains the projection of the world points on the image plane.
 p = cam.project(P)

 % Extract the coordinates of the points on the ground plane.
 Q = P(1:2,:);

 % With p and Q we can calculate the homography matrix using the toolbox function homography().
 H = homography(p, Q)

 % We can apply the homography to the set of image plane coordinates which gives the coordinates of 
 % the points on the ground plane.
 homtrans(H, p)

 % We can use the inverse of the homography matrix to do the opposite. Here we find the image plane 
 % coordinates given the coordinates of the point on the world ground plane.
  homtrans(inv(H), Q)


 % Load the tomato_125.jpg image, applying gamma correction. Display the image.
 im = iread('tomato_124.jpg', 'gamma', 2.2);

 % Use the Toolbox function colorkmeans() to perform k means classification for 4 colors. The 
 % function has 3 output arguments of which we are only looking at the first in this lecture. The output 
 % argument is the class image. The function accepts as input arguments the image to operate on and 
 % the number of colours to classify.
 [cls,cxy,resid] = colorkmeans(im, 4);

 % Display the class image.

 % The algorithm uses a randomised initialisation and hence the class labels will not be the same each 
 % time. In this instance the tomato red color has been given the class label of 3. We can create a binary 
 % image for all the pixels that belong to class 3.
 tom = cls == 3;

 % We can see that the fruit are not complete due to specular reflection and occlusion. We can use some 
 % morphological processing operations to patch the holes in the fruit. Here we will use the Toolbox 
 % function iclose() which accepts a binary image and a kernal as input and returns the closed binary 
 % image as output. We will use a 15x15 square kernel to begin with. Display the result.
 closed = iclose(tom, ones(15,15) );

 % We can see that a square may not have been the best choice given the round shape of the tomatoes. 
 % Let's instead try using a circular kernel which can be created using the Toolbox function kcircle(). 
 % First display the kernel to see what it looks like.
 se = kcircle(15);

 % And now try the closing operation using this kernel.
 closed = iclose(tom, se);