*
* This is a C++ implementation of k-fan matching and supervised learning for
* visual object recognition, as described in: 
*
*    Crandall, Felzenszwalb, Huttenlocher, "Spatial Priors for Part-based
*        Recognition using Statistical Models," CVPR 2005.
*
*
*
* 
* Disclaimer: this is research code and probably contains many bugs.
* Please feel free to contact me if you have any questions or problems:
*
*   David Crandall   crandall@cs.cornell.edu 
*     http://www.cs.cornell.edu/~crandall/research/kfans/
*
*

--> Note: currently this code only supports 0- and 1-fans. <--


INSTALLATION
============

This code uses two libraries: the GNU Scientific Library (gsl) and a
simple image processing library. I've included both in the archive. To
compile everything, just type:

  cd gsl-1.5; ./configure; make; cd ..
  cd DLib; make; cd ..
  make

The GSL makefiles are smart enough to auto-configure for your
machine's architecture, but the other two makefiles (the one in DLib
and the top-level one) assume you're running on a Pentium 4 with SSE3
extensions. If you're using another architecture, you'll have to edit
those two makefiles accordingly. Please feel free to write me
(crandall@cs.cornell.edu) if you have any problems with the compilation.

The result is an executable called match_lite, which does both
supervised learning and k-fan localization.



SUPERVISED LEARNING
===================

In supervised learning, a model is learned from a set of training
images and a file containing the pixel coordinates of each object
part in each training image.

As a sample, the training data for the Caltech motorbike image set
is included in this archive. This set was used for some of the
experiments in our CVPR05 paper. The training images are located in
the train_edges/ directory, and part locations are provided in the
motorbike_training.dat file.

To run the supervised training procedure on the sample data, do:

  ./match_lite -t motorbike_training.dat -K 1 -z 50 -o bike_model train_edges/*

This will create files bike_model.1fan and bike_model.appear, containing
the spatial and appearance models, respectively. Note that you can
train 0-fans by using the "-K 0" option, and you can change the size
of the appearance model templates using the -z option (e.g. "-z 50"
means use 50x50 pixel patches).



K-FAN LOCALIZATION
==================

./match_lite takes a spatial model file, an appearance model file,
and a list of test images, and produces localization results for
each image. As a sample test dataset, this archive includes the
Caltech motorbike test images, which were used for many of the
experiments in our CVPR05 paper. The images are located in the
test_edges/ subdirectory.

After running the learning procedure discussed above, you can
run localization with the learned models on this test dataset.
To do this, run:

  ./match_lite -A -o output -a bike_model.appear -s bike_model.1fan test_edges/*

The program should output the equal ROC points at the end of the run,
giving something like 97% for the 1-fan model. It also outputs
localization results in the file specified with the -o
option. For each image, the localization output contains the image file
name, localized position of (the center of) each object part, and the
likelihood ratio we use for detection (log of likelihood of MAP
localization over likelihood of no object in image), like this:

10003                   # image "number": the filename without path or extension
test_edges/10003.dat    # filename
79.6344                 # likelihood ratio; a higher number means higher confidence that the object is present
6 2                     # two integers: number of parts in the model, and dimensionality of localization results (always 2)
96 53                   # row and col of part 1
97 181                  # row and col of part 2
27 131                  #    - etc -
48 173
23 146
59 107
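
If you want to post-process the localization output, the record layout
above can be read with a few lines of C++. The following is only a
minimal sketch and is not part of match_lite: the Localization struct
and the read_localization() function are hypothetical names, and the
sketch assumes that part coordinates are integers (as in the sample)
and that any text after a '#' is an annotation that can be discarded.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One localization record, following the sample output shown above.
// (Hypothetical type; not defined anywhere in the match_lite sources.)
struct Localization {
    std::string image_id;                    // filename without path or extension
    std::string filename;                    // path of the edge-map file
    double likelihood_ratio = 0.0;           // higher means more confident detection
    std::vector<std::pair<int, int>> parts;  // (row, col) of each part
};

// Fetches the next non-blank line, with any '#' comment and the
// surrounding whitespace removed. Returns false at end of input.
static bool next_data_line(std::istream& in, std::string& line) {
    while (std::getline(in, line)) {
        const std::size_t hash = line.find('#');
        if (hash != std::string::npos) line.erase(hash);
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;  // blank or comment-only line
        const std::size_t last = line.find_last_not_of(" \t\r");
        line = line.substr(first, last - first + 1);
        return true;
    }
    return false;
}

// Reads one record from `in` into `out`. Returns false at end of input
// or on malformed data; `out` is left untouched in that case.
bool read_localization(std::istream& in, Localization& out) {
    Localization rec;
    std::string line;

    if (!next_data_line(in, line)) return false;
    rec.image_id = line;
    if (!next_data_line(in, line)) return false;
    rec.filename = line;

    if (!next_data_line(in, line)) return false;
    std::istringstream ratio(line);
    if (!(ratio >> rec.likelihood_ratio)) return false;

    // Header line: number of parts, then dimensionality (always 2).
    if (!next_data_line(in, line)) return false;
    std::istringstream header(line);
    int num_parts = 0, dims = 0;
    if (!(header >> num_parts >> dims) || num_parts < 0 || dims != 2) return false;

    for (int i = 0; i < num_parts; ++i) {
        if (!next_data_line(in, line)) return false;
        std::istringstream coords(line);
        int row = 0, col = 0;
        if (!(coords >> row >> col)) return false;
        rec.parts.emplace_back(row, col);
    }
    out = std::move(rec);
    return true;
}
```

Calling read_localization() in a loop on a std::ifstream opened on the
output file should yield one record per test image, until it returns
false.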


IMAGE FILE FORMAT
=================

./match_lite expects a custom image file format that currently must be
generated by some Matlab code. The sample images included in the
archive are already in that format, but if you want to run on other
images, you must use the run_edge.m Matlab script. For example, if you
have an image called img.jpg and want to produce the edge map file,
run:

  run_edge('img.jpg');

That produces an img.jpg.dat file that can then be passed into match_lite.


SOURCE CODE
===========

The code is relatively straightforward. Appear.h/.cpp contain the
appearance model matching code, and KFan.h/.cpp contain the spatial
matching code. The distance transform is carried out in the
DistTransform.h/.cpp files in the DLib directory. Learning of the
spatial and appearance models is carried out by Train_KFan.h/.cpp
and Train_Appear.h/.cpp, respectively.
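
For readers unfamiliar with distance transforms, here is a minimal
sketch of the one-dimensional generalized distance transform of a
sampled function, in the style of Felzenszwalb and Huttenlocher's
lower-envelope-of-parabolas algorithm. It is an illustration only and
is NOT the code in DLib/DistTransform.cpp: the function name
distance_transform_1d() is hypothetical, and the sketch assumes a
unit-weight squared distance, whereas a real spatial model would scale
distances by its learned variances.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Computes d[p] = min over q of ( (p - q)^2 + f[q] ) in O(n) time.
// All values in f must be finite; use a large finite cost (e.g. 1e6)
// rather than infinity for "no feature here".
std::vector<double> distance_transform_1d(const std::vector<double>& f) {
    const std::size_t n = f.size();
    std::vector<double> d(n);
    if (n == 0) return d;

    const double inf = std::numeric_limits<double>::infinity();
    std::vector<std::size_t> v(n);  // positions of parabolas in the lower envelope
    std::vector<double> z(n + 1);   // boundaries between consecutive parabolas
    std::size_t k = 0;              // index of the rightmost parabola in the envelope
    v[0] = 0;
    z[0] = -inf;
    z[1] = inf;

    // Pass 1: build the lower envelope, left to right.
    for (std::size_t q = 1; q < n; ++q) {
        const double qd = static_cast<double>(q);
        double s = 0.0;
        while (true) {
            const double vk = static_cast<double>(v[k]);
            // Intersection of the parabola rooted at q with the one rooted at v[k].
            s = ((f[q] + qd * qd) - (f[v[k]] + vk * vk)) / (2.0 * qd - 2.0 * vk);
            if (s > z[k]) break;
            --k;  // parabola v[k] is hidden; never underflows since z[0] = -inf
        }
        ++k;
        v[k] = q;
        z[k] = s;
        z[k + 1] = inf;
    }

    // Pass 2: read the envelope back out at each sample position.
    k = 0;
    for (std::size_t q = 0; q < n; ++q) {
        const double qd = static_cast<double>(q);
        while (z[k + 1] < qd) ++k;
        const double diff = qd - static_cast<double>(v[k]);
        d[q] = diff * diff + f[v[k]];
    }
    return d;
}
```

A two-dimensional transform can be obtained by applying the same
routine along every column and then along every row of the result.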

The 0- and 1-fan code is quite fast, taking about 0.3 seconds/image
and 0.5 seconds/image, respectively, on my machine for the motorbike
images. You can get an additional ~30% speed increase "for free" by
compiling with the Intel compiler and the -fast option, which
generates vectorized code to take advantage of the Pentium 4's SSE3
instructions.