Fully Convolutional Networks for Semantic Segmentation
This is the reference implementation of the models and code for the fully convolutional networks (FCNs) in the PAMI FCN and CVPR FCN papers:
Fully Convolutional Networks for Semantic Segmentation
Evan Shelhamer*, Jonathan Long*, Trevor Darrell
PAMI 2016
arXiv:1605.06211
Fully Convolutional Networks for Semantic Segmentation
Jonathan Long*, Evan Shelhamer*, Trevor Darrell
CVPR 2015
arXiv:1411.4038
Requirements: software
Python packages you might not have: numpy, PIL, python-opencv
Requirements: hardware
For training the FCN with VGG16 on VOC images (~500x350), 4 GB of GPU memory is sufficient (using cuDNN).
Installation (sufficient for the demo)
We'll call the directory of Seg-FCN $FCN_ROOT.
Download pre-computed Seg-FCN models
- FCN-32s PASCAL: single stream, 32 pixel prediction stride net, scoring 63.6 mIU on seg11valid
- FCN-16s PASCAL: two stream, 16 pixel prediction stride net, scoring 65.0 mIU on seg11valid
- FCN-8s PASCAL: three stream, 8 pixel prediction stride net, scoring 65.5 mIU on seg11valid and 67.2 mIU on seg12test
- FCN-8s PASCAL at-once: all-at-once, three stream, 8 pixel prediction stride net, scoring 65.4 mIU on seg11valid
Copy the downloaded weights into the model directory, e.g.:
cp fcn8s-heavy-pascal.caffemodel $FCN_ROOT/data/seg_fcn_models
These models were trained online with high momentum, using extra data from Hariharan et al., but excluding SBD val.
FCN-32s is fine-tuned from the ILSVRC-trained VGG-16 model, and the finer strides are then fine-tuned in turn.
The "at-once" FCN-8s is fine-tuned from VGG-16 all-at-once by scaling the skip connections to better condition optimization.
Demo
After successfully completing the basic installation, you'll be ready to run the demo:
cd $FCN_ROOT
python infer.py
The demo performs semantic segmentation with a VGG16-based FCN trained on SBDD.
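For reference, the demo amounts to a standard Caffe forward pass. The sketch below is a minimal version of that flow; the image path and the deploy/weight paths are assumptions, so adjust them to your layout (see infer.py for the actual script):

import numpy as np
from PIL import Image
import caffe

# Load an image and preprocess it the way the VGG nets expect:
# RGB -> BGR, subtract the ImageNet mean, reorder to C x H x W.
im = Image.open('demo/image.jpg')  # placeholder path
in_ = np.array(im, dtype=np.float32)
in_ = in_[:, :, ::-1]
in_ -= np.array((104.00699, 116.66877, 122.67892))
in_ = in_.transpose((2, 0, 1))

# Load the trained FCN-8s (paths are assumptions).
net = caffe.Net('voc-fcn8s/deploy.prototxt',
                'data/seg_fcn_models/fcn8s-heavy-pascal.caffemodel',
                caffe.TEST)

# FCNs take inputs of any size: reshape the data blob to this image,
# run the net, and take the per-pixel argmax over the class score maps.
net.blobs['data'].reshape(1, *in_.shape)
net.blobs['data'].data[...] = in_
net.forward()
out = net.blobs['score'].data[0].argmax(axis=0)  # H x W label map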
Beyond the demo: installation for training and testing models
- Download SBDD (for training) and VOC2011 (for testing):
wget http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2011/VOCtrainval_25-May-2011.tar
- Extract all of these tars into $FCN_ROOT/data; you should end up with this basic structure:
$FCN_ROOT/data/sbdd/dataset
$FCN_ROOT/data/pascal/VOC2011
# ... and several other directories ...
Follow the next section to download the pre-trained ImageNet model.
Download pre-trained ImageNet models
A pre-trained ImageNet model can be downloaded for the backbone net, VGG16.
Transplant the fully-connected net into a fully convolutional net
cp VGG16.v2.caffemodel $FCN_ROOT/transplant/VGG16
cd $FCN_ROOT/transplant/VGG16
python solve.py
This script will generate a new model, VGG16.fcn.caffemodel, for training.
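Under the hood, the transplant follows Caffe's net-surgery recipe: the weights of VGG16's inner-product layers are copied into convolution layers of matching shape, so fc6 becomes a 7x7 convolution and fc7 a 1x1 convolution. A minimal sketch of the idea, assuming placeholder prototxt filenames and fc6-conv/fc7-conv layer names (solve.py handles this for you):

import caffe

# The fully connected VGG16 and its fully convolutional twin.
# Both prototxt filenames here are placeholders, not the repo's files.
fc_net = caffe.Net('vgg16_fc.prototxt', 'VGG16.v2.caffemodel', caffe.TEST)
conv_net = caffe.Net('vgg16_conv.prototxt', 'VGG16.v2.caffemodel', caffe.TEST)

# Copy inner-product weights into the matching convolutions: the weight
# matrix is reinterpreted as the (out, in, h, w) convolution blob, and
# the biases transfer unchanged.
for fc, conv in [('fc6', 'fc6-conv'), ('fc7', 'fc7-conv')]:
    conv_net.params[conv][0].data.flat = fc_net.params[fc][0].data.flat
    conv_net.params[conv][1].data[...] = fc_net.params[fc][1].data

conv_net.save('VGG16.fcn.caffemodel')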
Training
There are two ways to train Seg-FCN:

- CVPR version:
First, train FCN-32s for 1 day.
Then, train FCN-16s, fine-tuned from FCN-32s, for 1 day.
Finally, train FCN-8s, fine-tuned from FCN-16s, for 1 day.
To follow this path, run $FCN_ROOT/voc-fcn32s/solve.py, voc-fcn16s/solve.py, and voc-fcn8s/solve.py sequentially.

- PAMI version:
Directly run $FCN_ROOT/voc-fcn8s-atonce/solve.py.

Both schedules train for the same total number of iterations; the PAMI version is simpler and gives similar results. Either way, each stage is driven by a solve.py, sketched below.
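For orientation, each stage's solve.py follows the same pattern: build a solver, copy weights from the previous stage, initialize the upsampling (deconvolution) layers as bilinear interpolation, then alternate training steps with scoring on seg11valid. A rough sketch for the FCN-16s stage, in which the snapshot name and split-file path are assumptions:

import numpy as np
import caffe
import surgery, score  # helper modules shipped with the FCN reference code

# Weights from the previous (FCN-32s) stage; the snapshot name is an assumption.
weights = 'voc-fcn32s/snapshot/train_iter_100000.caffemodel'

caffe.set_mode_gpu()
solver = caffe.SGDSolver('voc-fcn16s/solver.prototxt')
solver.net.copy_from(weights)  # fine-tune from the coarser-stride net

# Set the deconvolution ("up") layers to fixed bilinear interpolation.
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

# Train, periodically scoring mean IU on the seg11valid split.
val = np.loadtxt('data/seg11valid.txt', dtype=str)  # assumed split-file path
for _ in range(25):
    solver.step(4000)
    score.seg_tests(solver, False, val, layer='score')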
Trained Seg-FCN networks are saved under:
voc-fcnxs/snapshot/
Test outputs are saved under:
voc-fcnxs/segs/