In this tutorial, we walk through how to train YOLOv4 in Darknet for state-of-the-art object detection on your own dataset, with any number of classes.

Darknet logo.
Train YOLOv4 on a custom dataset with this tutorial on Darknet! (photo credit)

YOLOv5 is Out!

If you're here for the Darknet, stay for the Darknet. Otherwise, consider running the YOLOv5 PyTorch tutorial in Colab. You'll have a very performant, trained YOLOv5 model on your custom data in a matter of minutes.

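We will take the following steps to implement YOLOv4 on our custom data:

  • Configure our GPU environment on Google Colab
  • Install the Darknet YOLOv4 training environment
  • Download our custom dataset for YOLOv4 and set up directories
  • Configure a custom YOLOv4 training config file for Darknet
  • Train our custom YOLOv4 object detector
  • Reload YOLOv4 trained weights and make inference on test images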

Impatient? Jump to our YOLOv4 Colab notebook.

Introduction

Object detection models continue to get better, increasing in both performance and speed. In the real-time object detection space, YOLOv3 (released April 8, 2018) has been a popular choice, as has EfficientDet (released April 3, 2020) by the Google Brain team. Progress continues with the recent release of YOLOv4 (released April 23, 2020), which has been shown to be the new object detection champion by standard metrics on COCO.

MS COCO Object Detection: Average Precision vs FPS for Efficientnet, YOLOv4, YOLOv3, and ASFF.
YOLOv4 performance from the paper. (Citation)

These general object detection models are proven on the COCO dataset, which contains a wide range of objects and classes, on the premise that a model that performs well there will generalize well to new datasets. However, applying the deep learning techniques used in research to custom objects can be difficult in practice. We have been working to make that transition easy and have released similar tutorials in the past.

This post is among the first to help you apply YOLOv4 to a custom dataset, not just the objects included in the COCO dataset. By using YOLOv4, you are implementing many of the past research contributions in the YOLO family along with a series of contributions new to YOLOv4, including: WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss. In short, with YOLOv4, you're using a better object detection network architecture and new data augmentation techniques.

In this tutorial, we use the Darknet framework because the ability to train YOLOv4 in TensorFlow, Keras, and PyTorch frameworks is still under construction. (Update: YOLO v4 PyTorch implementation now available.)

If you would like to learn more about the research contributions made by YOLOv4, we recommend reading the following:

What is the Darknet Framework?

It's not TensorFlow, nor is it PyTorch, and it most certainly is not Keras. It is a custom framework written by Joseph Redmon (who, by the way, has a phenomenally fun resume). While Darknet is not as intuitive to use, it is immensely flexible, and it advances state-of-the-art object detection results.

In this post, we'll be using Darknet to implement YOLOv4. Along the way, we'll demystify the difficulties of getting Darknet set up within Colab. Stay tuned for future posts where we'll implement YOLOv4 in PyTorch, YOLOv4 in TensorFlow, and YOLOv4 in Keras.

Alright, let's get to it! We recommend reading this blog post alongside the Colab notebook.

Configuring our GPU Environment for YOLOv4 on Google Colab

For compute, we are going to use Google Colab, a hosted Jupyter notebook environment for Python that provides access to a GPU. Google Colab is free to use and, optionally, $10/month to upgrade to a Pro account.

You can use this tutorial on your local machine as well, but configurations will be slightly different. Regardless of environment, the important things we will need to train YOLOv4 are the following:

  • GPU with specific GPU drivers installed
  • OpenCV
  • cuDNN configured on top of GPU drivers

For the next steps, open our YOLOv4 Darknet Colab notebook.

Thankfully, Google Colab takes care of the first two for us, so we only need to configure cuDNN.

Terminal screenshot showing the results of nvidia-smi in Colab.
Colab gives you OpenCV and a GPU by default

Configuring cuDNN for YOLOv4

Google Colab does not come with cuDNN pre-installed yet, because NVIDIA controls its distribution. We'll handle that in a moment.

First, we check the NVIDIA CUDA drivers to see which version we have. The cuDNN tar file we download will depend on this. (This is included in our notebook.)
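As a rough illustration of this check, the sketch below pulls the release number out of version text. It assumes the version is read from the output of `nvcc --version`; the sample string is illustrative rather than captured from Colab, and `parse_cuda_release` is a hypothetical helper name.

```python
import re
from typing import Optional


def parse_cuda_release(nvcc_output: str) -> Optional[str]:
    """Return the CUDA release (e.g. '10.1') found in `nvcc --version` text, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


# Illustrative sample shaped like the last line of nvcc's version banner.
sample = "Cuda compilation tools, release 10.1, V10.1.243"
print(parse_cuda_release(sample))
```

In a notebook, the text could be obtained with `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout` and passed to the helper; the release it returns is what you would match against the cuDNN download.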

Terminal screenshot showing CUDA drivers and cuDNN.
Checking Cuda driver version

To acquire the cuDNN install files, head to https://developer.nvidia.com/cudnn and choose the Linux install file that matches your NVIDIA CUDA drivers. In our case, it was cudnn-10.1-linux-x64-v7.6.5.32.tgz.

Next, we bring the downloaded file into Colab by way of Google Drive: drop the file into Google Drive, then read it from Drive inside Colab. If you have another means of bringing in the file (e.g. you are working locally), feel free to use it!

Next, we install cuDNN by extracting the .tgz archive into our /usr/local folder. You will know the install succeeded from the following command and printout.
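One way to perform the extraction from Python is sketched below with the standard `tarfile` module. The helper name `extract_archive` and the paths in the usage note are assumptions, not the notebook's exact code.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_archive(archive_path: str, dest_dir: str) -> List[str]:
    """Extract a gzip-compressed tar archive into dest_dir; return the member names."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return names
```

In Colab, this might be called as `extract_archive("/content/drive/My Drive/cudnn-10.1-linux-x64-v7.6.5.32.tgz", "/usr/local")`, where the Drive path is hypothetical and depends on where you dropped the file.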

Terminal screenshot showing the results of nvcc -V
Double checking cuDNN install

✅ Check.

Install the Darknet YOLO v4 training environment

Next, we clone our fork of the Darknet YOLO v4 repository. We have made a few minor tweaks to remove print statements and to change the Makefile to play well with Google Colab. If you are on a local machine (not Colab), have a look at the Makefile for your machine.

Colab Free Tier K80 GPU Note: the Makefile in this tutorial was built for the P100 GPU accelerator that is typically provisioned on Colab Pro. If you are on the Colab free tier, you might receive a K80 GPU, seen above with nvidia-smi. In that case, you will likely need to change the architecture specified in the Makefile. For K80:

ARCH= -gencode arch=compute_30,code=sm_30
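The ARCH value follows a regular pattern, which the small sketch below makes explicit. `gencode_flag` is a hypothetical helper, and the right numbers for your card are an assumption you should confirm against NVIDIA's compute capability table; the (3, 0) pair simply reproduces the K80 line shown above.

```python
def gencode_flag(major: int, minor: int) -> str:
    """Build the nvcc -gencode flag for a given compute capability."""
    cc = f"{major}{minor}"
    return f"-gencode arch=compute_{cc},code=sm_{cc}"


# Reproduces the Makefile line used in this tutorial for the K80.
k80_line = "ARCH= " + gencode_flag(3, 0)
print(k80_line)
```

For a P100, which has compute capability 6.0, the same helper would be called as `gencode_flag(6, 0)`.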

Moving along, after we have cloned the repository, we run !make to build Darknet for YOLOv4. If your make is successful, you will see a number of printouts, and at the bottom you will see the line beginning with:

g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -DOPENCV pkg-config --cflags opencv4 2> /dev/null

Finally, we download the newly released convolutional neural network weights used in YOLOv4.

yolov4.conv.137     100%[===================>] 162.16M  64.2MB/s    in 2.5s    

✅ All set.

Download Our Custom Dataset for YOLOv4 and Set Up Directories

To train YOLOv4 on Darknet with our custom dataset, we need to import our dataset in Darknet YOLO format.

To import our images and bounding boxes in the YOLO Darknet format, we'll use Roboflow.

Don't have a dataset? You can also start with one of the free computer vision datasets. The dataset used in this tutorial is Blood Cell Count and Detection (BCCD), which you can fork to add to your Roboflow account.

To get your data into Roboflow, create a free Roboflow account. Upload your images and their annotations in any format (VOC XML, COCO JSON, TensorFlow Object Detection CSV, etc).

Once uploaded, select a couple of preprocessing steps. We recommend auto-orient and resize to 416x416 (YOLO presumes multiples of 32).

Roboflow screenshot: BCCD Dataset.
The settings I've chosen for my example dataset, BCCD.

Next, click "Generate" to create a version of these images that we will load into Colab. Optionally, provide a name for your version. Once the images are generated, you'll be prompted to create an export. Export your images and annotations in the Darknet format. Be sure to select "show download code."

Roboflow Screenshot: YOLO v3 Darknet Download.
Export as YOLO Darknet, and "Show Download Code."

Once the download is zipped, we'll be provided a line of code to download our data anywhere we need. Copy this link, and paste it into our Colab notebook where prompted.

Terminal Screenshot: Downloading a dataset from Roboflow.
Downloading data from Roboflow - it will download in train/valid/test splits and as a combination of images and annotation txt.
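For reference, each line of a Darknet annotation txt file holds a class index followed by the box center x, center y, width, and height, all normalized to the image size. The sketch below converts one such line into pixel corner coordinates; `PixelBox`, `parse_yolo_line`, and the sample line are illustrative names and values, not taken from the BCCD dataset.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_yolo_line(line: str, img_w: int, img_h: int) -> PixelBox:
    """Convert 'class xc yc w h' (normalized) into pixel corner coordinates."""
    class_id, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return PixelBox(int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2)


# A made-up annotation for a 416x416 image.
box = parse_yolo_line("1 0.5 0.5 0.25 0.5", 416, 416)
print(box)
```

Drawing a few boxes this way over the downloaded images is a quick sanity check that the export matches what you labeled.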

If you are working locally and already have your dataset in the right format, you can use the same Roboflow link or simply copy your files into the directories manually.

Then, we run some code to move the image and annotation files into the correct directories for training.
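Part of this setup is giving Darknet a text file that lists the training images, one path per line, with each image's annotation txt sitting next to it under the same base name. A minimal sketch of writing such a list follows; `write_image_list` and the directory names in the usage note are assumptions rather than the notebook's exact code.

```python
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_image_list(image_dir: str, list_path: str) -> List[str]:
    """Write one image path per line to list_path and return the paths written."""
    paths = sorted(
        str(p) for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_path).write_text("".join(path + "\n" for path in paths))
    return paths
```

For example, `write_image_list("data/obj", "data/train.txt")` would list every image under a hypothetical data/obj folder; a second call with the validation images would produce the matching validation list.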

✅ Onward.

Configure a Custom YOLOv4 Training Config File for Darknet

Configuring the training config for YOLOv4 for a custom dataset is tricky, and we handle it automatically for you in this tutorial. We'll set defaults for the learning rate and batch size below, and you should feel free to adjust these to your dataset's needs.

We set up the config by combining a series of chunked config files. We take the following steps according to the YOLOv4 repository:

  • Set batch to 64 - the batch size is the number of images per iteration
  • Set subdivisions to 12 - subdivisions are the number of pieces each batch is split into so that it fits in GPU memory
  • Set max_batches to 2000 * number of classes
  • Set steps to 80% and 90% of max_batches
  • Change classes in all of the YOLO layers to your number of classes
  • Change filters in the convolutional layer before each YOLO layer to (classes + 5) * 3

Most of these you will not need to change. You may want to lower the number of subdivisions to speed up training (fewer subdivisions are faster) or raise it if your GPU does not have enough memory (more subdivisions require less memory).
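The class-dependent values above can be worked out with a few lines of arithmetic. The sketch below is an illustration only: `yolo_config_values` is a hypothetical helper, and the filters formula, (classes + 5) * 3, comes from the Darknet repository's guidance for custom objects rather than from this post.

```python
def yolo_config_values(num_classes: int) -> dict:
    """Compute the class-dependent settings for a custom YOLOv4 config."""
    max_batches = 2000 * num_classes
    return {
        "max_batches": max_batches,
        # Integer arithmetic avoids floating-point rounding surprises.
        "steps": (max_batches * 8 // 10, max_batches * 9 // 10),
        "classes": num_classes,
        # Applies to the convolutional layer just before each YOLO layer.
        "filters": (num_classes + 5) * 3,
    }


# Example: a dataset with three classes.
values = yolo_config_values(3)
print(values)
```

With three classes, this gives max_batches of 6000, steps of 4800 and 5400, and 24 filters.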

✅ Good to go!

Train Our Custom YOLOv4 Object Detector

Now that we have set up the environment, we can begin to train our custom YOLOv4 object detector.
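Training is launched through the darknet binary. The sketch below only assembles the command as a list; the data and config file paths are hypothetical placeholders, while `-dont_show` (no display window) and `-map` (compute mAP during training) are flags of the Darknet fork this tutorial builds on.

```python
from typing import List


def darknet_train_command(data_file: str, cfg_file: str, pretrained_weights: str) -> List[str]:
    """Assemble a Darknet training command for a custom detector."""
    return [
        "./darknet", "detector", "train",
        data_file, cfg_file, pretrained_weights,
        "-dont_show",  # headless environments such as Colab have no display
        "-map",        # report mAP on the validation set while training
    ]


# Placeholder paths; substitute the files created in the earlier steps.
train_cmd = darknet_train_command("data/obj.data", "cfg/custom-yolov4-detector.cfg", "yolov4.conv.137")
print(" ".join(train_cmd))
```

The list can be handed to `subprocess.run(train_cmd, check=True)`, or the printed string can be run in a notebook cell prefixed with `!`.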

Terminal screenshot: Training darknet.
Training Custom YOLOv4 detector... ⏰

Training progress will print after every iteration. The mAP will be calculated on the validation set and printed every 1000 iterations. (See our post explaining mAP to learn more.)

Note: Training will take approximately six hours for 300 images. This is a research framework, not optimized for quick training. To shorten the run, try lowering the number of subdivisions and lowering max_batches.

Watch the "avg loss" to see whether your detector is converging. Choose the weights from the iteration that achieves the best mAP on your validation set.
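If you save the training printout to a file, the average loss can be pulled out for plotting. The sketch below assumes each iteration line looks like "N: loss, avg avg loss, ..."; that layout is an assumption about the log format, the sample text is made up, and `parse_avg_loss` is a hypothetical helper.

```python
import re
from typing import List, Tuple

# Assumed line shape: "<iteration>: <loss>, <avg> avg loss, ..."
LOSS_LINE = re.compile(r"^\s*(\d+):\s*[\d.]+,\s*([\d.]+)\s+avg loss")


def parse_avg_loss(log_text: str) -> List[Tuple[int, float]]:
    """Return (iteration, avg_loss) pairs found in a training log."""
    points = []
    for line in log_text.splitlines():
        match = LOSS_LINE.match(line)
        if match:
            points.append((int(match.group(1)), float(match.group(2))))
    return points


# Made-up log excerpt for illustration.
sample_log = (
    " 1: 1850.2, 1850.2 avg loss, 0.000000 rate, 5.1 seconds, 64 images\n"
    "Loaded: 0.000031 seconds\n"
    " 2: 1849.7, 1850.1 avg loss, 0.000000 rate, 5.0 seconds, 128 images\n"
)
print(parse_avg_loss(sample_log))
```

A curve that keeps falling suggests the detector is still learning, while a flat one suggests it has converged.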

Training...

Almost there.

Using Our Custom YOLO v4 Detector for Inference

In this section, we will use our trained custom YOLO v4 detector to run inference on test images. During training, the weights for our detector are saved every 100 iterations in the ./backup/ directory. We can reload these weights and run inference on a test image. Remember to use the weights that achieved the highest mAP on your validation set.
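Inference on a single image goes through the same binary with the `detector test` subcommand. As before, the sketch below only builds the command; every file path in it is a hypothetical placeholder for the files produced in your own run.

```python
from typing import List


def darknet_test_command(data_file: str, cfg_file: str, weights_file: str, image_file: str) -> List[str]:
    """Assemble a Darknet command that runs a trained detector on one image."""
    return [
        "./darknet", "detector", "test",
        data_file, cfg_file, weights_file, image_file,
        "-dont_show",  # skip the display window in headless environments
    ]


# Placeholder paths; point weights_file at the checkpoint with the best mAP.
test_cmd = darknet_test_command(
    "data/obj.data",
    "cfg/custom-yolov4-detector.cfg",
    "backup/my_detector.weights",
    "test/example.jpg",
)
print(" ".join(test_cmd))
```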

Example microscope image from BCCD.
My YOLOv4 model for cell detection is the best one I have ever trained

✅ There you have it!

You have trained your own YOLO v4 model to detect custom objects. I have personally found that YOLO v4 performs best among the models I have tried on my custom object detection tasks.

Saving Model Weights for Future Use

You can save your model weights by copying them from the ./backup/ directory back into your Google Drive. Later, you can re-import them to resume training or to run inference.
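A minimal sketch of that copy step is shown below, assuming Google Drive is already mounted in Colab. `backup_weights` is a hypothetical helper, and the destination folder in the usage note is a placeholder.

```python
import shutil
from pathlib import Path
from typing import List


def backup_weights(src_dir: str, dest_dir: str) -> List[str]:
    """Copy every .weights file from src_dir into dest_dir; return the file names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for weights_file in sorted(Path(src_dir).glob("*.weights")):
        shutil.copy2(weights_file, dest / weights_file.name)
        copied.append(weights_file.name)
    return copied
```

For example, `backup_weights("backup", "/content/drive/My Drive/yolov4-weights")` would copy the checkpoints into a hypothetical Drive folder.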

Conclusion

In this post, we have walked through training YOLOv4 on your custom object detection task. We have covered the following steps to go from zero to 100 with YOLOv4:

  • Configure our GPU environment on Google Colab
  • Install the Darknet YOLO v4 training environment
  • Download our custom dataset for YOLO v4 and set up directories
  • Configure a custom YOLO v4 training config file for Darknet
  • Train our custom YOLO v4 object detector
  • Reload YOLO v4 trained weights and make inference on test images

Please enjoy deploying the state of the art for detecting your custom objects 🚀

Stay tuned for future tutorials such as a YOLO v4 tutorial in PyTorch, YOLO v4 tutorial in TensorFlow, YOLO v4 tutorial in Keras, and comparing YOLO v4 to EfficientDet for object detection.