The Machine Perception Toolbox


CenterSurround Class Reference

X cells in the cat's retina and LGN, and P cells in the monkey's, behave like linear filters.

[Inheritance diagram for CenterSurround]

[Collaboration diagram for CenterSurround]

Public Member Functions

 CenterSurround ()
 Create a FeatureData object which describes the basic classifier (at scale = 1).
void search (RImage< float > &pixels)
 overloaded "search" function, which in this case, computes the filter on the image.
 ~CenterSurround ()

Detailed Description

X cells in the cat's retina and LGN, and P cells in the monkey's, behave like linear filters.

Their receptive fields are well described by a difference of two Gaussian functions: a wider function representing the surround and a narrower function representing the center. On-center cells respond maximally to a white spot on a black background; off-center cells respond maximally to a black spot on a white background.

The receptive field (impulse response) of a DOG filter is specified by three parameters: the standard deviation of the center Gaussian, the standard deviation of the surround Gaussian, and the weight of the surround Gaussian. Typical parameter values for cat LGN X cells are sigma_c = 0.3 degrees of visual angle, sigma_s = 1.5 degrees of visual angle, and gain_s = 5. (The angle subtended by the width of your thumb at arm's length is approximately 1 degree of visual angle.) The figure below shows a graphical representation of a DOG filter with the typical parameters described above.

DogFilter.jpg
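
As a sanity check on these numbers, the receptive-field profile can be tabulated with a short standalone program (not MPT code). Exactly how gain_s enters the formula is an assumption of this sketch: each Gaussian is taken to have unit volume, with gain_s weighting the surround.

#include <cmath>
#include <cstdio>

// Unit-volume 2-D Gaussian evaluated at radius r (degrees of visual angle).
static double gauss2d(double r, double sigma) {
    const double PI = 3.14159265358979323846;
    return std::exp(-r * r / (2.0 * sigma * sigma)) / (2.0 * PI * sigma * sigma);
}

int main() {
    const double sigma_c = 0.3;  // center std dev (deg of visual angle)
    const double sigma_s = 1.5;  // surround std dev (deg of visual angle)
    const double gain_s  = 5.0;  // weight of the surround Gaussian (assumed usage)
    // On-center DOG: positive near r = 0, negative in the surround.
    for (double r = 0.0; r <= 4.0; r += 0.5)
        std::printf("r = %3.1f deg: %+8.4f\n", r,
                    gauss2d(r, sigma_c) - gain_s * gauss2d(r, sigma_s));
    return 0;
}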

Here we approximate the typical LGN X cell using a difference of two rectangular filters. The least-absolute-value approximation to the typical DOG filter described above is given by two concentric squares: the first with a half-length of 0.4643 degrees of visual angle and a gain of 0.5517, the second with a half-length of 2.3216 degrees of visual angle and a gain of -0.1563. Measured in units of the surround square's half-length and gain, the center square has half-length 1/5 and weight -3.52. The figure below shows a graphical representation of the receptive field of the approximate filter.

BoxFilter.jpg

CenterSurround.jpg
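
The quoted ratios follow directly from the four box parameters above; a few lines of throwaway code (not MPT code) confirm the arithmetic:

#include <cstdio>

int main() {
    const double center_halflen   = 0.4643, center_gain   =  0.5517;
    const double surround_halflen = 2.3216, surround_gain = -0.1563;
    // In units of the surround square's half-length and gain:
    std::printf("half-length ratio: %.4f\n", center_halflen / surround_halflen);  // 0.2000, i.e. 1/5
    std::printf("gain ratio:        %.2f\n", center_gain / surround_gain);        // -3.53 (quoted above as -3.52)
    return 0;
}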

CenterSurround is an object which shows the output of a center-surround filter at multiple scales. It inherits from MPISearchObjectDetector, which provides an architecture for performing operations on every sub-window of an image at multiple scales. The only two things that a derived class needs to do are:

1. describe its features by filling in the FeatureData member "data" in its constructor (here, CenterSurround());
2. implement the per-window computation by overriding the "search" function (here, search()).

Note that the MPISearchObjectDetector parent class still leaves the child class with a little bit of work to do in this second function. This represents a balancing act between usability and speed. However, it also provides a great deal of flexibility. While the original classes were designed for object detection, they can be used to perform fast shift-variant filtering operations, i.e., the width or shape of the impulse response may depend on the position in the image plane. Furthermore, if the actual task is to perform a Viola-Jones style search, then all that is needed is to provide a new set of filters, tuning curves, thresholds, etc. in the constructor.
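
To make the pattern concrete, here is a schematic sketch of a minimal subclass. The name BoxAverage and its single whole-window feature are invented for illustration, and the base-class spelling (including any template parameters it may take) is assumed from this page's usage; FeatureData, Feature, Corner, and RImage< float > are the MPT types that appear in the listings below.

// Schematic sketch only -- not part of MPT.
class BoxAverage : public MPISearchObjectDetector {
 public:
  BoxAverage() {
    // (1) Describe the features at scale = 1 by filling in the member "data".
    data.patch_width  = 5 + 1;   // +1 for the integral-image border (see below)
    data.patch_height = 5 + 1;
    data.numfeatures  = 1;
    data.features     = new Feature[data.numfeatures];
    data.features[0].numcorners = 4;
    data.features[0].corners    = new Corner[4];
    // A single box over the whole 5x5 window.
    data.features[0].corners[0].x = 0; data.features[0].corners[0].y = 0; data.features[0].corners[0].value =  1;
    data.features[0].corners[1].x = 5; data.features[0].corners[1].y = 0; data.features[0].corners[1].value = -1;
    data.features[0].corners[2].x = 0; data.features[0].corners[2].y = 5; data.features[0].corners[2].value = -1;
    data.features[0].corners[3].x = 5; data.features[0].corners[3].y = 5; data.features[0].corners[3].value =  1;
  }
  // (2) Walk every scale and window, evaluating the feature on the integral
  // image, as CenterSurround::search() does below.
  void search(RImage< float > &pixels);
};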

Definition at line 84 of file LGN.cpp.


Constructor & Destructor Documentation

CenterSurround::CenterSurround ( )

Create a FeatureData object which describes the basic classifier (at scale = 1).

This fills in the values in the member object "data", which is of type FeatureData. We manually construct a center-surround feature. Typically, if there are many features, an external program is used to generate these features and the code to load them into "data".

Definition at line 101 of file LGN.cpp.

References Feature::corners, FeatureData::features, Feature::numcorners, FeatureData::numfeatures, FeatureData::patch_height, FeatureData::patch_width, Corner::value, Corner::x, and Corner::y.

00101                               {
00102   // Because we are working with discrete pixels, not real-valued
00103   // points, we run into off-by-one issues.  To compute the integral
00104   // between points a and b in 1-D continuous space, given the
00105   // integrated function f(x), we would compute f(b) - f(a).  In
00106   // words, you want to look up the integral at b, then subtract off
00107   // everything to the left of a.  In discrete pixel space, since
00108   // pixels take up space, in order to subtract off everything to the
00109   // left of a, you have to go to the pixel at a-1 so you don't
00110   // subtract off the integral *at* pixel a.  In our 2-D case, this
00111   // means that we always have to widen our features by one, up and to
00112   // the left.  The result is that the total size of a patch is one
00113   // plus the width of the window of interest in each direction.  Let's
00114   // take an example: To get the sum of image pixels within the
00115   // rectangle with upper left hand corner at (1,1) and lower right
00116   // hand corner at (2,2), the corners used in the *integral-image*
00117   // feature are (0,0), (0,2), (2,0), and (2,2).  So the total extent
00118   // of the feature, in terms of the distance (inclusive) between the
00119   // pixels we will do lookups on in the *integral* image, is 3, not
00120   // 2.  It may be confusing at first to see that the patch width is
00121   // searchWidth +1, so until you get used to it, just keep in mind
00122   // that this off-by-one business can be a little confusing, so be
00123   // careful!
00124 
00125   // That said, we want our basic center-surround to be of size 5 x 5.
00126   // Thus, the smallest valid integral image patch would be 6 x 6 (if
00127   // we had logic to check if we went off the edge of the image at
00128   // every pixel access, we could make it 5 x 5, but that would really
00129   // slow things down).
00130   data.patch_width = 5 + 1;
00131   data.patch_height = 5 + 1;  
00132   int searchWidth = data.patch_width - 1, searchHeight = data.patch_height - 1;
00133   
00134   // Now, allocate space for the features
00135   data.numfeatures = 2;
00136   data.features = new Feature[data.numfeatures]; 
00137 
00138   // The first feature is a rectangle over the entire patch -- this is
00139   // used in DC subtraction.  Note that we're specifying indexes into
00140   // pixels from 0,0 to 5,5 -- which requires a 6x6 patch of valid
00141   // pixels.
00142   data.features[0].numcorners = 4;  
00143   data.features[0].corners = new Corner[data.features[0].numcorners]; 
00144   data.features[0].corners[0].x = 0;           data.features[0].corners[0].y = 0;            data.features[0].corners[0].value =  1;
00145   data.features[0].corners[1].x = searchWidth; data.features[0].corners[1].y = 0;            data.features[0].corners[1].value = -1;
00146   data.features[0].corners[2].x = 0;           data.features[0].corners[2].y = searchHeight; data.features[0].corners[2].value = -1;
00147   data.features[0].corners[3].x = searchWidth; data.features[0].corners[3].y = searchHeight; data.features[0].corners[3].value =  1;
00148   
00155   int innerWidth = (int)(searchWidth*.20), outerWidth = searchWidth-innerWidth;
00156   int innerHeight = (int)(searchHeight*.20), outerHeight = searchHeight-innerHeight;
00157   data.features[1].numcorners = 8;
00158   // Outer rect
00159   data.features[1].corners = new Corner[data.features[1].numcorners]; 
00160   data.features[1].corners[0].x = 0;           data.features[1].corners[0].y = 0;            data.features[1].corners[0].value =  1;
00161   data.features[1].corners[1].x = searchWidth; data.features[1].corners[1].y = 0;            data.features[1].corners[1].value = -1;
00162   data.features[1].corners[2].x = 0;           data.features[1].corners[2].y = searchHeight; data.features[1].corners[2].value = -1;
00163   data.features[1].corners[3].x = searchWidth; data.features[1].corners[3].y = searchHeight; data.features[1].corners[3].value =  1;
00164   // Inner rect. In LGN X cells, the gain of the inner rectangle is
00165   // about -3.52 times the gain of the outer rectangle. Since "value"
00166   // is an integer, we use -4
00167   data.features[1].corners[4].x = innerWidth; data.features[1].corners[4].y = innerHeight; data.features[1].corners[4].value = -4;
00168   data.features[1].corners[5].x = outerWidth; data.features[1].corners[5].y = innerHeight; data.features[1].corners[5].value =  4;
00169   data.features[1].corners[6].x = innerWidth; data.features[1].corners[6].y = outerHeight; data.features[1].corners[6].value =  4;
00170   data.features[1].corners[7].x = outerWidth; data.features[1].corners[7].y = outerHeight; data.features[1].corners[7].value = -4;
00171   
00172 }
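
The off-by-one rule in the long comment above can be checked with a tiny standalone program, independent of MPT. It builds an integral image with an explicit zero border row and column, so the comment's lookups at a-1 become lookups shifted by +1 here, then recovers the sum of the 2x2 rectangle from the comment's example:

#include <cstdio>
#include <vector>

int main() {
    const int W = 4, H = 4;
    std::vector<int> img(W * H, 1);          // all-ones test image
    // Integral image of size (W+1) x (H+1): row 0 and column 0 stay zero.
    std::vector<int> ii((W + 1) * (H + 1), 0);
    for (int y = 1; y <= H; ++y)
        for (int x = 1; x <= W; ++x)
            ii[y*(W+1)+x] = img[(y-1)*W + (x-1)]
                          + ii[(y-1)*(W+1)+x] + ii[y*(W+1)+x-1]
                          - ii[(y-1)*(W+1)+x-1];
    // Sum of image pixels in the rectangle (1,1)..(2,2): four lookups at the
    // border-shifted corners (1,1), (1,3), (3,1), (3,3).
    int x0 = 1, y0 = 1, x1 = 2, y1 = 2;
    int sum = ii[(y1+1)*(W+1)+(x1+1)] - ii[y0*(W+1)+(x1+1)]
            - ii[(y1+1)*(W+1)+x0]     + ii[y0*(W+1)+x0];
    std::printf("sum = %d (expect 4)\n", sum);
    return 0;
}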

CenterSurround::~CenterSurround ( ) [inline]

Definition at line 87 of file LGN.cpp.

00087 {};


Member Function Documentation

void CenterSurround::search ( RImage< float > & pixels )

overloaded "search" function, which in this case, computes the filter on the image.

The results are written into an image and displayed using ImageMagick. The user must close each window to get the next one.
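
For context, a hypothetical driver sketch follows; main() is not reproduced on this page, and the RImage construction and any stream setup the base class requires are assumptions here.

// Hypothetical usage -- not taken from LGN.cpp.
CenterSurround cs;                         // constructor builds the FeatureData
RImage< float > pixels(width, height);     // assumed constructor signature
// ... fill "pixels" with grayscale data and perform whatever pyramid/stream
// initialization MPISearchObjectDetector requires ...
cs.search(pixels);                         // one ImageMagick window per scale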

Definition at line 179 of file LGN.cpp.

References MPIImagePyramid::begin(), MPISearchStream::corners, MPIImagePyramid::end(), FeatureData::features, MPISearchStream::fns, MPISearchStream::height, MPISearchStream::images, MPISearchStream::mpi, Feature::numcorners, CornerCache::value, and MPISearchStream::width.

Referenced by main().

00179                                                  {
00180 
00181   // stream.images is an array of pointers to the RImage<T> base
00182   // class.  We cast stream.images[0] as a pointer to its derived
00183   // class RIntegral, and call the integrate method.  Javier: Why not
00184   // let RImage have an integrate method itself?  Ian: Because RImage
00185   // is the *base* class from which RIntegral and RDerivative are
00186   // derived.  We do this because we want to be able to use lists of
00187   // images of different types.  Thus we use a list of pointers to the
00188   // base class, even though in this case we are always looking at
00189   // RIntegrals.  In the future, I think a good idea might be to use
00190   // the mixed list type from the Boost Library, which allows mixed
00191   // types in the list.  Then we wouldn't have to typecast the
00192   // pointer.  If we do that, then perhaps we should also think about
00193   // using Boost shared_pointers, which are reference counted, so we
00194   // don't have to worry about memory management as much.  Currently
00195   // memory is managed pretty well by hand, but as libmpisearch
00196   // evolves, use of shared_pointers would make it a lot easier to
00197   // develop less error-prone code.  The only possible downside would
00198   // be a potential speed hit from shared_pointer, but I think this
00199   // could be minimized, even down to *no* effective penalty.
00200   static_cast<RIntegral< float >* >(stream.images[0])->integrate(pixels);
00201   
00202   float a,b,c,d,s,mean;
00203   int scale_index, x, y;
00204   float scale_factor, area;
00205   unsigned int rect_ind = 0, cent_surr_ind = 1; 
00206   int numrectcorners = data.features[rect_ind].numcorners;
00207   int numcent_surrcorners = data.features[cent_surr_ind].numcorners;
00208   int numWindows = 0;
00209 
00210 
00211     // An image pyramid consists of scales and scales consist of
00212     // windows.  The outer for loop below iterates through scales. The
00213     // inner loop iterates over windows within each scale.
00214 
00215 
00216 
00217 
00218   // Get begin and end iterators for the outer loop.  mpi is a
00219   // datamember of the class MPISearchStream and it is of type
00220   // MPIImagePyramid.  MPIImagePyramid is a container class
00221   // representing all the patches at all scales in an image.
00222 
00223    
00224   MPIImagePyramid< float >::const_iterator scale = stream.mpi->begin();
00225   MPIImagePyramid< float >::const_iterator last_scale = stream.mpi->end();
00226   
00227   for( ; scale != last_scale; ++scale){
00228 
00229     // Retrieve the cached info about this scale
00230 
00231     scale_index = scale.getScale(scale_factor);
00232 
00233     //The class CornerCache has 4 members: scaledCornerX,
00234     //scaledCornerY, scaledIndex, value 
00235 
00236 
00237     // Javier: What are CornerCaches
00238     //used for?  stream is of type MPISearchStream. corners is a
00239     //private datamember of MPISearchStream. corners is of type
00240     //vector<CornerCache<T>**>
00241 
00242     CornerCache< float > **corners = stream.corners[scale_index];
00243     CornerCache< float > *rect_corners = corners[rect_ind];
00244     CornerCache< float > *cs_corners = corners[cent_surr_ind];
00245   
00246     
00247 
00248     // stream.fns[i] is a vector which counts, at each scale i, the
00249     // total number of additions or subtractions of a pixel by each
00250     // feature. This can be used to remove the DC component of the
00251     // filter, i.e., we want the filter to have zero output for
00252     // patches that have pixels of constant value.  For instance,
00253     // suppose a feature operates on a 5x5 patch by adding the 25
00254     // pixels of the patch and then subtracting the center pixel. If
00255     // presented with a 5x5 patch in which all the pixels have a
00256     // constant intensity x, then the output of the filter would be
00257     // (25-1) * x.  If we want the filter to have zero output for
00258     // constant-valued patches, we then have to subtract (25-1)
00259     // * mean, where mean is the average value of the 25 pixels in the
00260     // patch.
00261 
00262 
00263     // Below this comment, we store the inverse of the number of
00264     // pixels in the 0th feature ahead of time, which helps us compute
00265     // the mean.  Why the inverse?  So we don't have to issue a divide
00266     // at each step, rather we can issue a mult, which is much faster
00267     // on most CPUs.
00268     
00269     // Javier: Shouldn't we take into consideration the weights of
00270     // the feature components?  In our case, for example, the
00271 
00272     float one_over_rect_area = 1.0 / stream.fns[scale_index][rect_ind];
00273 
00274 
00275 
00276     // Finally, we will have to scale the image to between 0 and 1 at
00277     // the end due to ImageMagick's assumptions about how float images
00278     // are represented.
00279     float minval = 99999, maxval= -99999;
00280     
00281 
00282 
00283 
00284     // A pyramid consists of scales and scales consist of windows. We
00285     // now iterate over the windows within each scale. 
00286 
00287 
00288 
00289     MPIScaledImage< float >::const_iterator window = (*scale).begin(), last_window = (*scale).end();
00290     for( ; window != last_window; ++window, ++numWindows){      
00291       // First, compute the sum of the entire patch, so we can remove the DC component.
00292       // Since we know this is just a square, we don't need any for loops.
00293       a = window.getPixel0( rect_corners[0].scaledIndex ) * rect_corners[0].value;
00294       b = window.getPixel0( rect_corners[1].scaledIndex ) * rect_corners[1].value;
00295       c = window.getPixel0( rect_corners[2].scaledIndex ) * rect_corners[2].value;
00296       d = window.getPixel0( rect_corners[3].scaledIndex ) * rect_corners[3].value;
00297       mean = a + b + c + d;
00298       mean *= one_over_rect_area;
00299       
00300       // Now, compute the center surround feature. In this case, we
00301       // could compute all 8 corners explicitly as above, but instead
00302       // let's do it in a more generic way, so that we don't have to
00303       // know anything about the feature a priori. Done this way, we
00304       // can compute an arbitrary feature stored as feature 1 of
00305       // "data".  We could generalize this even more by looping over
00306       // data.numfeatures instead of specifying the center_surround
00307       // feature as we do here.  This way, you could compute N
00308       // features in an identical way.  To see this in action, look at
00309       // MPISearchObjectDetector.classifyWindow
00310       //
00311       // Instead of looping over every corner, we unroll this loop a
00312       // little, which clues the compiler in to the fact that the
00313       // value of each corner (or at least, each group of 4 corners)
00314       // is independent.  This allows it to schedule instructions in
00315       // such a way that achieved a 40% speed improvement on a
00316       // PowerMac G4 using gcc 3.1 due to efficient use of the
00317       // pipeline. This sets the stage for vectorization using AltiVec
00318       // or SSE, if someone ever wants to try this.
00319       s = 0;
00320       for(int corner = 0; corner < numcent_surrcorners; corner+=4) {
00321         a = window.getPixel0( cs_corners[corner].scaledIndex   ) * cs_corners[corner].value;
00322         b = window.getPixel0( cs_corners[corner+1].scaledIndex ) * cs_corners[corner+1].value;
00323         c = window.getPixel0( cs_corners[corner+2].scaledIndex ) * cs_corners[corner+2].value;
00324         d = window.getPixel0( cs_corners[corner+3].scaledIndex ) * cs_corners[corner+3].value;
00325         s += a + b + c + d;
00326       }
00327       // Now, remove the DC component. This is done by counting the
00328       // number of times a pixel was added or subtracted ( stored in
00329       // stream.fns ), multiplying by the mean, then subtracting it
00330       // from the total.
00331       s -= mean*stream.fns[scale_index][cent_surr_ind];
00332 
00333       // Now set every pixel of the corresponding block in our output image to s
00334       for(int i = 0; i < scale_factor; ++i)
00335         for(int j = 0; j < scale_factor; ++j)
00336           window.setPixel(1, j, i, s);
00337 
00338       // Finally, because of the way we are displaying the image, we
00339       // need to get min and max vals.
00340       if(s < minval)
00341         minval = s;
00342       if(s > maxval)
00343         maxval = s;
00344       window.getCoords(x,y);
00345     }
00346     // We're finished, so now display the result.  
00347     // First, make sure all values are between 0 and 1;
00348     float range = maxval - minval; 
00349     float one_over_range = 1.0/range;
00350     unsigned int numpixels = stream.images[1]->width*stream.images[1]->height;
00351     float * p = stream.images[1]->array;
00352     for(unsigned int i = 0; i < numpixels ; ++i, ++p)
00353       *p = (*p - minval)*one_over_range;
00354     // Now, copy the pixels into an ImageMagick image.
00355     Image img((const unsigned int)stream.images[1]->width,(const unsigned int)stream.images[1]->height,
00356               "I",FloatPixel,stream.images[1]->array);
00357 
00358     // Crop and scale the image for visibility.
00359     const Geometry cropregion(x+scale_factor,y+scale_factor);
00360     img.crop(cropregion);
00361     const Geometry newsize(stream.images[1]->width-1,stream.images[1]->height-1);
00362     img.scale(newsize);
00363     img.type(GrayscaleType);
00364     img.fontPointsize(40);
00365     //Geometry(width, height, xoffset, yoffset);
00366     char tmpstring[20];
00367     sprintf(tmpstring,"Scale %d", (int)scale_factor);
00368     img.annotate(tmpstring, Geometry(50,50, 40, 40) );
00369     img.display();
00370     // img.write("Output2.jpg");
00371   }
00372 }
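
The DC-removal arithmetic from the comment at line 00248 can be verified with a toy standalone program (not MPT code); the numbers mirror the 5x5 example in that comment:

#include <cstdio>

int main() {
    // A feature that adds the 25 pixels of a 5x5 patch and subtracts the
    // center pixel touches pixels a net (25 - 1) times: the role of stream.fns.
    const int fns = 25 - 1;
    const float mean = 7.0f;            // mean of a constant-valued patch
    float s = 25 * 7.0f - 7.0f;         // raw feature output on that patch
    s -= mean * fns;                    // subtract the DC component
    std::printf("DC-corrected response: %g (expect 0)\n", s);
    return 0;
}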

The documentation for this class was generated from the following file: LGN.cpp
Generated on Mon Nov 8 17:08:30 2004 for MPT by  doxygen 1.3.9.1