Caffe (CPM Data Layer)

The CPM architecture has been adapted to use the Human3.6M dataset's set of joints and to include an implementation of the manifold layer.

The architecture includes the manifold layer in all stages. Additionally, a fusion layer has been introduced which merges the heat-maps corresponding to the same joint (the output of the stage and the output of the manifold layer).

Data dependencies

  1. /data/Human3.6M/Data/ has to contain all the subject directories for testing and training the model
  2. models/cpm_architecture/data contains the learned model for the manifold layer
  3. python/manifold/ contains all the dependencies for the manifold layer
  4. models/cpm_architecture/jsonDatasets contains the files for training and testing
    • H36M_annotations.json (optional for testing)
    • H36M_annotations_testSet.json
    • H36M_masks.mat
  5. models/cpm_architecture/lmdb contains the train lmdb database used for training the model (optional for testing)
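To quickly verify that these dependencies are in place before training or testing, a small path check such as the sketch below can be used; the paths are the ones listed above, and the script itself is not part of this repository.

  import os

  # Paths from the "Data dependencies" list above.
  # The annotation JSON and the lmdb database are optional for testing.
  required = [
      '/data/Human3.6M/Data/',
      'models/cpm_architecture/data',
      'python/manifold/',
      'models/cpm_architecture/jsonDatasets/H36M_annotations_testSet.json',
      'models/cpm_architecture/jsonDatasets/H36M_masks.mat',
  ]
  optional = [
      'models/cpm_architecture/jsonDatasets/H36M_annotations.json',
      'models/cpm_architecture/lmdb',
  ]

  for path in required:
      print('%-8s %s' % ('OK' if os.path.exists(path) else 'MISSING', path))
  for path in optional:
      print('%-8s %s (optional for testing)' % ('OK' if os.path.exists(path) else 'missing', path))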

Python dependencies

sudo pip install protobuf
sudo pip install scikit-image
sudo pip install matplotlib
sudo apt-get install python-tk
sudo pip install mpldatacursor
sudo apt-get install python-yaml
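A quick way to confirm that the dependencies above installed correctly is an import check like the following sketch (the module names are the standard ones for these packages; Tkinter assumes the Python 2 environment usually paired with this Caffe fork):

  import importlib

  # Import names corresponding to the packages installed above.
  modules = [
      'google.protobuf',   # protobuf
      'skimage',           # scikit-image
      'matplotlib',
      'Tkinter',           # python-tk (named tkinter on Python 3)
      'mpldatacursor',
      'yaml',              # python-yaml
  ]

  for name in modules:
      try:
          importlib.import_module(name)
          print('OK      ' + name)
      except ImportError as err:
          print('MISSING %s (%s)' % (name, err))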

Libraries

In order to have the proper visualisation, the following changes are necessary.

Edit the file

site-packages\mpl_toolkits\mplot3d\axes3d.py

adding the new code in the class set-up

self.pbaspect = [1.0, 1.0, 1.0]

and changing the axis-limit code to

xmin, xmax = self.get_xlim3d() / self.pbaspect[0]
ymin, ymax = self.get_ylim3d() / self.pbaspect[1]
zmin, zmax = self.get_zlim3d() / self.pbaspect[2]
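After this patch, the aspect ratio of a 3D plot can be controlled by setting pbaspect on the axes. A minimal usage sketch (pbaspect is the attribute added above; the rest is standard matplotlib):

  import matplotlib.pyplot as plt
  from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection

  fig = plt.figure()
  ax = fig.add_subplot(111, projection='3d')

  # pbaspect is introduced by the patch above; values other than
  # [1.0, 1.0, 1.0] rescale the corresponding axis in the projection.
  ax.pbaspect = [1.0, 1.0, 0.5]

  ax.plot([0, 1], [0, 1], [0, 1])
  plt.show()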

Fine-tuning a CNN for detection with Caffe

  GLOG_logtostderr=1 build/tools/caffe train \
  -solver models/my_dir/caltech_finetune_solver_original.prototxt \
  -weights models/my_dir/bvlc_reference_caffenet.caffemodel \
  -gpu 0 2>&1 | tee results/log.txt
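Since the command tees its output to results/log.txt, the training loss can be extracted from that file afterwards. A minimal sketch, assuming the usual Caffe solver log lines of the form "Iteration N, loss = X" (the exact format may vary between Caffe versions):

  import re

  # Match lines such as "... solver.cpp:228] Iteration 100, loss = 0.173".
  pattern = re.compile(r'Iteration (\d+), loss = ([0-9.eE+-]+)')

  losses = []
  with open('results/log.txt') as f:
      for line in f:
          m = pattern.search(line)
          if m:
              losses.append((int(m.group(1)), float(m.group(2))))

  for iteration, loss in losses:
      print('%d\t%g' % (iteration, loss))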

Multiple GPUs

Append to any of the commands above

  --gpu=0,1

to use two GPUs.

NOTE: each GPU runs the batch size specified in your train_val.prototxt, so going from 1 GPU to 2 GPUs doubles your effective batch size. e.g. if your train_val.prototxt specifies a batch size of 256 and you run on 2 GPUs, your effective batch size is now 512. You therefore need to adjust the batch size when running multiple GPUs and/or adjust your solver parameters, specifically the learning rate.
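For illustration, the arithmetic from the note above, together with the linear learning-rate scaling heuristic that is often used in this situation (the note only says to adjust the learning rate; linear scaling is an assumption, not something this fork prescribes):

  def effective_batch_size(prototxt_batch_size, num_gpus):
      # Each GPU runs the batch size given in train_val.prototxt.
      return prototxt_batch_size * num_gpus

  def linearly_scaled_lr(base_lr, base_batch_size, new_batch_size):
      # Common heuristic: scale the learning rate with the batch size.
      return base_lr * float(new_batch_size) / base_batch_size

  batch = 256
  gpus = 2
  eff = effective_batch_size(batch, gpus)    # 512
  lr = linearly_scaled_lr(0.01, batch, eff)  # 0.02
  print(eff, lr)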

Hardware Configuration Assumptions

The current implementation uses a tree reduction strategy. e.g. if there are 4 GPUs in the system, 0:1, 2:3 will exchange gradients, then 0:2 (top of the tree) will exchange gradients, 0 will calculate updated model, 0->2, and then 0->1, 2->3.

For best performance, P2P DMA access between devices is needed. Without P2P access, for example crossing PCIe root complex, data is copied through host and effective exchange bandwidth is greatly reduced.

The current implementation has a "soft" assumption that the devices being used are homogeneous. In practice, any devices of the same general class should work together, but performance and total size are limited by the smallest device being used. e.g. if you combine a TitanX and a GTX 980, performance will be limited by the 980. Mixing vastly different levels of boards, e.g. Kepler and Fermi, is not supported.

"nvidia-smi topo -m" will show you the connectivity matrix. You can do P2P through PCIe bridges, but not across socket level links at this time, e.g. across CPU sockets on a multi-socket motherboard.

CNN profiling

  caffe time -model /path/to/file/structure.prototxt -iterations 10

By default this is executed in CPU mode. If GPU-mode profiling is required instead, use:

  caffe time -model /path/to/file/structure.prototxt -gpu 0 -iterations 10

About

Caffe: a fast open framework for deep learning. Fork maintained by shihenw at CMU.
