tommyliu/visionPJ1
This is a C++ implementation of k-fan matching and supervised learning for
visual object recognition, as described in:

  Crandall, Felzenszwalb, Huttenlocher, "Spatial Priors for Part-based
  Recognition using Statistical Models," CVPR 2005.

Disclaimer: this is research code with probably many bugs. Please feel
free to contact me if you have any questions or problems:

  David Crandall
  crandall@cs.cornell.edu
  http://www.cs.cornell.edu/~crandall/research/kfans/

--> Note: currently this code only supports 0- and 1-fans. <--

INSTALLATION
============

This code uses two libraries: the GNU Scientific Library (gsl) and a
simple image processing library. I've included both in the archive. To
compile everything, just type:

  cd gsl-1.5; ./configure; make; cd ..
  cd DLib; make; cd ..
  make

The GSL library makefiles are smart enough to auto-configure for your
machine's architecture, but the other two makefiles assume you're running
on a Pentium 4 with SSE3 extensions. If you're using another architecture,
you'll have to edit those makefiles accordingly. Please feel free to write
me (crandall@cs.cornell.edu) if you have any problems with the compilation.

The result is an executable called match_lite, which performs both
supervised learning and k-fan localization.

SUPERVISED LEARNING
===================

In supervised learning, a model is learned from a set of training images
and a file containing the pixel coordinates of each object part in each
training image. As a sample, the training data for the Caltech motorbike
image set is included in this archive; this set was used for some of the
experiments in our CVPR05 paper. The training images are located in the
train_edges/ directory, and part locations are provided in the
motorbike_training.dat file.
To run the supervised training procedure on the sample data, do:

  ./match_lite -t motorbike_training.dat -K 1 -z 50 -o bike_model train_edges/*

This will create the files bike_model.1fan and bike_model.appear,
containing the spatial and appearance models, respectively. Note that you
can train 0-fans by using the "-K 0" option, and you can change the size
of the appearance model templates using the -z option (e.g. "-z 50" means
use 50x50 pixel patches).

K-FAN LOCALIZATION
==================

match_lite takes a spatial model file, an appearance model file, and a
list of test images, and produces localization results for each image. As
a sample test dataset, this archive includes the Caltech motorbike test
images, which were used for many of the experiments in our CVPR05 paper.
The images are located in the test_edges/ subdirectory.

After running the learning procedure discussed above, you can run
localization with the learned models on this test dataset:

  ./match_lite -A -o output -a bike_model.appear -s bike_model.1fan test_edges/*

The program outputs the equal ROC point at the end of the run, giving
something like 97% for the 1-fan model. It also writes localization
results to the file specified with the -o option.
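For concreteness, the equal ROC point reported above is the operating point
where the true positive rate on images containing the object equals the true
negative rate on background images. This is a hypothetical sketch of that
computation from per-image likelihood ratios (match_lite computes it
internally; the function name and toy scores below are illustrative only):

```python
def equal_roc_point(pos_scores, neg_scores):
    """Sweep detection thresholds over the likelihood ratios and return
    the rate at the point where the true positive rate (object images
    accepted) best matches the true negative rate (background rejected)."""
    best_gap, best_rate = None, None
    for t in sorted(pos_scores + neg_scores):
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        tnr = sum(s < t for s in neg_scores) / len(neg_scores)
        gap = abs(tpr - tnr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (tpr + tnr) / 2
    return best_rate

# Toy likelihood ratios for images with / without the object.
print(equal_roc_point([80.0, 75.0, 60.0, 55.0], [40.0, 30.0, 20.0, 10.0]))
```

On these toy scores the classes separate perfectly, so the equal ROC point
is 1.0; on real output it would land around the reported 97%.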
For each image, the localization output contains the image file name, the
localized position of (the center of) each object part, and the likelihood
ratio used for detection (the log of the likelihood of the MAP localization
over the likelihood of no object in the image), like this:

  10003                 # image "number": the filename without path or extension
  test_edges/10003.dat  # filename
  79.6344               # likelihood ratio; higher means more confidence that the object is present
  6 2                   # number of parts in the model, and dimensionality of the results (always 2)
  96 53                 # row and col of part 1
  97 181                # row and col of part 2
  27 131                # - etc -
  48 173
  23 146
  59 107

IMAGE FILE FORMAT
=================

match_lite expects a custom image file format that currently must be
generated by some Matlab code. The sample images included in the archive
are already in that format, but if you want to run on other images, you
must use the run_edge.m Matlab script. For example, if you have an image
called img.jpg and want to produce the edge map file, run:

  run_edge('img.jpg');

This produces an img.jpg.dat file that can then be passed to match_lite.

SOURCE CODE
===========

The code is relatively straightforward. Appear.cpp and Appear.h contain
the appearance model matching code, and KFan.cpp and KFan.h contain the
spatial matching code. The distance transform is carried out by
DistTransform.cpp/.h in the DLib directory. Learning is carried out by
Train_KFan.cpp/.h and Train_Appear.cpp/.h. The 0- and 1-fan code is quite
fast, taking about 0.3 and 0.5 seconds/image, respectively, on my machine
for the motorbike images. You can get an additional ~30% speed increase
"for free" by compiling with the Intel compiler and the -fast option,
which generates vectorized code that takes advantage of the Pentium 4's
SSE3 instructions.
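The per-image localization records described above are easy to read back
for scoring or visualization. This is a minimal sketch, assuming exactly
the layout shown in the example (the function and field names are mine,
not part of match_lite):

```python
def parse_localizations(text):
    """Parse match_lite localization output: for each image, an id line,
    a filename line, a likelihood ratio, a 'num_parts dims' line, then
    one 'row col' line per part."""
    tokens = iter(text.split("\n"))
    results = []
    for line in tokens:
        line = line.strip()
        if not line:
            continue
        image_id = line
        filename = next(tokens).strip()
        likelihood_ratio = float(next(tokens))
        num_parts, dims = map(int, next(tokens).split())
        parts = [tuple(map(int, next(tokens).split())) for _ in range(num_parts)]
        results.append({"id": image_id, "file": filename,
                        "score": likelihood_ratio, "parts": parts})
    return results

sample = """10003
test_edges/10003.dat
79.6344
6 2
96 53
97 181
27 131
48 173
23 146
59 107"""
for r in parse_localizations(sample):
    print(r["id"], r["score"], r["parts"][0])
```

Each dictionary holds one image's detection score and its list of
(row, col) part centers, so thresholding on "score" reproduces the
detection decision.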
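The distance transform mentioned above is what makes 1-fan matching fast:
it turns a per-pixel part-placement cost into the minimum over all
placements of cost plus squared displacement. As an illustration only (this
is the standard Felzenszwalb-Huttenlocher lower-envelope algorithm, not the
actual DistTransform.cpp code), here is the 1-D squared-distance transform;
the 2-D case applies it along rows and then columns:

```python
def dt1d(f):
    """1-D squared-distance transform: d[q] = min_p ((q - p)^2 + f[p]),
    computed in linear time via the lower envelope of parabolas."""
    n = len(f)
    d = [0.0] * n
    v = [0] * n              # x-locations of parabolas in the lower envelope
    z = [0.0] * (n + 1)      # boundaries between adjacent parabolas
    k = 0
    z[0], z[1] = -1e20, 1e20
    for q in range(1, n):
        # Intersection of the parabola rooted at q with the rightmost one.
        s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        while s <= z[k]:
            k -= 1
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
        k += 1
        v[k] = q
        z[k], z[k + 1] = s, 1e20
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d

# Cost 0 at position 2, effectively infinite elsewhere: the transform
# recovers the squared distance to that position.
BIG = 1e9
print(dt1d([BIG, BIG, 0.0, BIG, BIG]))
```

Because each position enters and leaves the envelope at most once, the
whole transform is O(n) per row, which is why the 1-fan matcher runs in
fractions of a second per image.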