MXNet baseline model for iNaturalist Challenge at FGVC 2017 competition

I have prepared a baseline model using MXNet for the iNaturalist Challenge at FGVC 2017 competition on Kaggle. The code is available on GitHub, and the public LB score is 0.117. Please follow the discussion thread if you have any questions.

How to use

Install MXNet

Run pip install mxnet-cu80 after installing the CUDA driver, or get the latest version from GitHub.

Windows user? No CUDA 8.0? No GPU? Run pip search mxnet and pick the package that matches your platform.

Generate lists

After downloading and unzipping the train and test sets into data/, along with the necessary .json annotation files, run the Python script under data/ to generate train.lst, val.lst, and test.lst.
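For reference, MXNet's image list format has one tab-separated line per image: an integer index, the label, and the image path relative to the data root. The sketch below builds such lines from a COCO-style annotation dictionary. The JSON field names (images, annotations, file_name, image_id, category_id), the sample paths, and the helper name make_lst_lines are assumptions for illustration, not the contents of the actual script.

```python
import json

def make_lst_lines(annotation):
    """Return .lst lines (index<TAB>label<TAB>path) from a COCO-style dict."""
    # Map each image id to its class label (field names are assumed).
    label_of = {a["image_id"]: a["category_id"] for a in annotation["annotations"]}
    lines = []
    for index, image in enumerate(annotation["images"]):
        label = label_of[image["id"]]
        lines.append("%d\t%d\t%s" % (index, label, image["file_name"]))
    return lines

# Tiny made-up annotation, standing in for the parsed annotation file.
example = json.loads("""
{"images": [{"id": 7, "file_name": "train_val_images/Aves/a.jpg"},
            {"id": 9, "file_name": "train_val_images/Insecta/b.jpg"}],
 "annotations": [{"image_id": 7, "category_id": 3},
                 {"image_id": 9, "category_id": 41}]}
""")
lst_lines = make_lst_lines(example)
print("\n".join(lst_lines))
```

A test list has no labels, so a placeholder label would be needed there.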

Generate rec files

A good way to speed up training is to maximize IO throughput by using the .rec format, which also makes data augmentation convenient. In the data/ directory, the provided script generates train.rec and val.rec for the training and validation sets; the conversion tool it relies on can be obtained from the MXNet repo. One can lower the --quality 95 parameter to save disk space, at the risk of losing training precision.


Run the training shell script with sh; it looks like this (for a machine with four GTX 1080 GPUs, for example):

python --pretrained-model model/resnet-152 \
    --load-epoch 0 --gpus 0,1,2,3 \
    --model-prefix model/iNat-resnet-152 \
    --data-nthreads 48 \
    --batch-size 48 --num-classes 5089 --num-examples 579184

Please adjust --gpus and --batch-size according to your machine configuration. A sample calculation: a batch size of 12 can use 8 GB of memory on one GTX 1080, so --batch-size 48 is a good fit for a 4-GPU machine.
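The calculation above can be written out as a small helper. This is only a sketch: the function name is hypothetical, and the per-GPU figure of 12 is the GTX 1080 number quoted above, so it should be re-measured for other cards or models.

```python
def total_batch_size(gpus, per_gpu_batch=12):
    """Turn a --gpus string such as '0,1,2,3' into a --batch-size value."""
    gpu_ids = [g for g in gpus.split(",") if g.strip()]
    # The total batch is split evenly across the listed GPUs.
    return per_gpu_batch * len(gpu_ids)

print(total_batch_size("0,1,2,3"))  # 48
print(total_batch_size("0"))        # 12
```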

An internet connection is needed for the first run because the script has to download the pretrained model. If the machine has no internet connection, please download the corresponding model files on another machine and copy them to the model/ directory.

Generate submission file

After a long run, e.g. 30 epochs, we can select some epochs for the submission file. Run sub.py with two parameters, the epoch number and the GPU id, like:

python sub.py 21 0

This selects the 21st epoch and runs inference on GPU #0. One can run several epochs on different GPUs and ensemble the results for a better submission file.
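One simple way to ensemble is to average the per-class probabilities that each epoch's model assigns to an image and rank the classes by the averaged score. The sketch below assumes each run yields one probability list per image; the function name and the top_k parameter are illustrative and not part of sub.py.

```python
def ensemble_top_k(prob_runs, top_k=5):
    """Average class probabilities from several runs and return the top_k class ids."""
    num_classes = len(prob_runs[0])
    averaged = [sum(run[c] for run in prob_runs) / len(prob_runs)
                for c in range(num_classes)]
    # Sort class ids by averaged probability, highest first.
    ranked = sorted(range(num_classes), key=lambda c: averaged[c], reverse=True)
    return ranked[:top_k]

# Made-up probabilities for one image over 4 classes, from two epochs.
epoch_21 = [0.10, 0.60, 0.20, 0.10]
epoch_27 = [0.05, 0.30, 0.55, 0.10]
print(ensemble_top_k([epoch_21, epoch_27], top_k=2))
```

Averaging is only one option; weighting later epochs more heavily or voting on the predicted labels are equally plausible variations.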

How ‘fine-tune’ works

The fine-tune method starts by loading a pretrained 152-layer ResNet (trained on ImageNet with 11k classes) from the MXNet model zoo, so the model already has some predictive power, and then adapts it to the new task by continuing to train on the provided data.

The key technique comes from lr_step_epochs, where we assign a smaller learning rate and less regularization once training reaches certain epochs. In this example, we give lr_step_epochs='10,20', which means the learning rate is reduced at the 10th and 20th epochs, so the weights change more slowly and the fine-tune procedure can converge the network while learning from the newly provided samples. A similar idea applies to data augmentation: fine-tuning is given less augmentation. This technique is described in Mu’s thesis.
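To make the step schedule concrete, the sketch below computes the learning rate in effect at each epoch for lr_step_epochs='10,20'. The base rate 0.01 and the reduction factor 0.1 are assumed values chosen for illustration, not settings confirmed by this post, and the function name is hypothetical.

```python
def lr_at_epoch(epoch, base_lr=0.01, lr_factor=0.1, lr_step_epochs="10,20"):
    """Learning rate used during `epoch` (0-based) under a step schedule."""
    steps = [int(s) for s in lr_step_epochs.split(",")]
    # Each step epoch already reached multiplies the rate by lr_factor once.
    passed = sum(1 for s in steps if epoch >= s)
    return base_lr * (lr_factor ** passed)

for epoch in (0, 9, 10, 19, 20, 29):
    print(epoch, lr_at_epoch(epoch))
```

Under these assumptions the rate stays at its base value for the first ten epochs, drops by a factor of ten for the next ten, and drops once more from epoch 20 onward.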

This pipeline is not limited to the pretrained ResNet-152 model. Please experiment with fine-tuning other models from MXNet’s model zoo, such as ResNet-101 or Inception, by following this tutorial and this sample code. Please feel free to submit issues or pull requests, or to discuss on the Kaggle forum, if you get better results.


