Releasing an Improved Blood Count and Cell Detection (BCCD) Dataset

Computer vision is revolutionizing medical diagnosis by helping doctors spot patterns they may not have seen and catch errors they may have overlooked.

Thus, it's unsurprising that one of the more popular "hello world" datasets for object detection is the Blood Count and Cell Detection (BCCD) dataset. Now two years old, it is a collection of blood cell photos, originally open sourced by cosmicad and akshaylambda. It contains 364 images across three classes: WBC (white blood cells), RBC (red blood cells), and Platelets.

Upon examining this dataset, however, the Roboflow team discovered that its labeling has room for improvement.

Here's an original image with raw labels that appear to be comprehensive:

Everything is labeled in this original!

Yet here's another original image whose raw labels are clearly missing bounding boxes:

Only platelets are labeled here!

Now, fair warning: the members of the Roboflow team are not doctors or domain experts and do not claim to have cell biology expertise. However, a review of the original 364 microscope images turned up examples like the one above, where the missing labels can be added intuitively.
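One way to surface images like the platelets-only example for human review is to count the labeled boxes per class in each annotation file and flag any file where a class is entirely absent. The following is a minimal sketch, not the process the Roboflow team used. It assumes the annotations are PASCAL VOC XML files in which each box is an `<object>` element with a `<name>` child. The function names (`count_labels`, `missing_classes`, `review_candidates`) and the annotation directory are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# The three class names listed in this post.
CLASSES = ("WBC", "RBC", "Platelets")


def count_labels(xml_text: str) -> Counter:
    """Count bounding boxes per class in one annotation.

    Assumes PASCAL VOC layout: <annotation><object><name>RBC</name>...</object>...
    """
    root = ET.fromstring(xml_text)
    return Counter(
        obj.findtext("name", default="").strip() for obj in root.iter("object")
    )


def missing_classes(counts: Counter, classes=CLASSES) -> list:
    """Return the classes that have zero labeled boxes in this annotation."""
    return [name for name in classes if counts[name] == 0]


def review_candidates(annotation_dir: str):
    """Yield (filename, counts) for every .xml file missing at least one class.

    `annotation_dir` is a hypothetical path to a folder of VOC XML files.
    """
    for path in sorted(Path(annotation_dir).glob("*.xml")):
        counts = count_labels(path.read_text(encoding="utf-8"))
        if missing_classes(counts):
            yield path.name, counts
```

A flagged file is only a candidate for review, not proof of a labeling error: a microscope field can legitimately contain no white blood cells or no platelets. A file with no RBC boxes at all, though, is a strong hint that labels were skipped, which matches the example above.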

Upon reviewing and relabeling, the Roboflow team added 187 labels: 183 RBC, three WBC, and one Platelets. That dataset is freely available here.

(If you're looking to build an object detection model with this dataset, be sure to check out our tutorial, available here!)

We'll be running tests on the importance of missing labels shortly. Stay tuned...



Roboflow accelerates your computer vision workflow through automated annotation quality assurance, universal annotation format conversion (like PASCAL VOC XML to COCO JSON), team sharing and versioning, and direct export to file formats such as TFRecords. It's free for datasets up to 1GB.
