techniques specifically for the detection of moving objects that are to be grasped by a robotic system.
They use a complex motion model for tracking objects moving within a fixed plane. The robot control system used in their research is very similar to ours, utilizing many processing devices (operating at different rates) through the use of predictive filtering. However, this system requires additional modifications before it can be directly applied to situations with multiple moving objects. In addition, it assumes the use of a static camera.
Considering the visual tracking portion of our framework, we rely on the sum-of-squared differences algorithm discussed by Anandan¹⁵ as a means for calculating displacement vectors. This algorithm has previously been used by Papanikolopoulos in his implementation of controlled active vision, and it has also been used by Tomasi and Kanade¹⁶ to measure the suitability of feature windows for tracking. A very interesting approach to the computer vision aspect of the visual tracking problem has been proposed by Nayar et al.¹⁷
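As an illustration of the sum-of-squared differences matching mentioned above, the following is a minimal sketch, not the implementation used in this work or in Anandan's. It assumes grayscale images stored as lists of rows, a square feature window, and an exhaustive search over a small displacement radius. The function names `ssd`, `extract` and `ssd_displacement`, and all parameters, are hypothetical.

```python
def ssd(window_a, window_b):
    """Sum of squared differences between two equally sized 2-D windows."""
    return sum(
        (a - b) ** 2
        for row_a, row_b in zip(window_a, window_b)
        for a, b in zip(row_a, row_b)
    )


def extract(image, top, left, size):
    """Return the size x size window whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def ssd_displacement(prev, curr, top, left, size, radius):
    """Return the (dy, dx) within +/- radius that minimises the SSD between
    the window at (top, left) in prev and the shifted window in curr."""
    reference = extract(prev, top, left, size)
    height, width = len(curr), len(curr[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r, c = top + dy, left + dx
            # Skip candidate windows that fall outside the image.
            if r < 0 or c < 0 or r + size > height or c + size > width:
                continue
            score = ssd(reference, extract(curr, r, c, size))
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


# Synthetic example: an 8 x 8 intensity ramp shifted down 1 row and right 2 columns.
prev = [[8 * y + x + 1 for x in range(8)] for y in range(8)]
curr = [[prev[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(8)]
        for y in range(8)]
displacement = ssd_displacement(prev, curr, top=2, left=2, size=3, radius=2)
```

In this synthetic case the window is recovered at a displacement of one row and two columns. A real tracker would typically restrict the search to a neighbourhood consistent with the expected inter-frame motion, since the cost of the exhaustive search grows with the square of the radius.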
The use of tracking information as feedback to our robot control scheme is based on a MIMO adaptive controller of Feddema and Mitchell.¹⁸ Similar adaptive schemes have previously been used by Koivo and Houshangi,¹⁹ Weiss et al.²⁰ and Nelson and Khosla²¹ for the vision-based control of manipulators. Moreover, adaptive schemes have been used by Brown²² and Dickmanns and Zapp²³ for the control of various other mechanical systems (e.g. robotic heads, satellites and cars). This research
has also been influenced by the work of Ghosh and Loucks,² who have proposed the use of the “perspective theory” (an elaborate adaptive scheme) for the computation of motion and structure in machine vision.
2. DETECTION OF OBJECTS OF INTEREST
2.1. Theoretical basis of the detection problem
The basis of the detection framework is that every image consists of pixels belonging to one of two categories: figure or ground. Figure pixels are those which are believed to be part of the projection of an object of interest, while ground pixels correspond to that object’s surroundings. If we allow the possibility of processing multiple objects, then a pixel is figure if it is part of any of the objects’ projections. We formalize the detection problem as the identification and the analysis of figure pixels in each frame of a temporal sequence of images, i.e. the computation of f(x, y, t)