Roboflow simplifies your computer vision workflow, from data organization and annotation verification through preprocessing and augmentation to exporting in the model format you need.

Specifically, you can use Roboflow to:

  • Convert VOC XML annotations to COCO JSON annotations (and vice versa)
  • See if your labels are in-frame (and one-click correct them if they are not)
  • Preprocess images: resizing, grayscale, auto-orientation, contrast adjustments
  • Augment images to increase your training data: flip, rotate, brighten / darken, crop, shear, blur, and add random noise
  • Generate annotation formats like TFRecords, CreateML and Turi Create, and custom YOLOv3 implementations (flat text files or Darknet)
  • Version datasets and share them with your team
  • Share datasets across your organization
  • Easily use your data across models built in TensorFlow, PyTorch, Keras, and more
  • Obtain and share public datasets
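
To make the first bullet concrete, here is a minimal sketch of the core of a VOC-to-COCO conversion: the bounding box arithmetic. VOC XML stores a box as two corners (xmin, ymin, xmax, ymax), while COCO JSON stores it as [x, y, width, height]. The function names are our own for illustration, and the sketch ignores the rest of each format (image records, category IDs, pixel-indexing conventions).

```python
def voc_to_coco_bbox(xmin, ymin, xmax, ymax):
    """Convert a VOC corner box to a COCO [x, y, width, height] box."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def coco_to_voc_bbox(x, y, width, height):
    """Convert a COCO [x, y, width, height] box back to VOC corners."""
    return (x, y, x + width, y + height)


# Round trip on an illustrative box.
coco_box = voc_to_coco_bbox(10, 20, 110, 220)
voc_box = coco_to_voc_bbox(*coco_box)
print(coco_box, voc_box)
```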

Roboflow eliminates the papercuts inhibiting better, faster object detection models.

Adding Data

Let’s walk through a tutorial on managing images for a chess piece detection problem.

To get started, create an account using your email or GitHub account:

After reviewing and accepting the terms of service, you’ll be presented with your datasets homepage:

Roboflow Screenshot: Empty datasets page (Ready to create your first dataset?)

For this walkthrough, we’ll use the Roboflow-provided sample dataset. Near the center of the page, select, “Download a sample dataset to test it out.”

Here, we automatically take care of (1) creating your dataset and naming it “Chess Sample” and (2) giving your annotations the name “Pieces.” You’ve also downloaded a zip file containing 20 chess images and their annotations across the 12 different pieces.

As the guided tutorial suggests, click “Create Dataset.”

Now, unzip the sample file we just downloaded to your computer. Click and drag the folder called “chess-tutorial-dataset” from your local machine onto the highlighted upload area.

As an aside, feel free to poke around the contents of chess-tutorial-dataset on your computer so you can see what’s inside:

MacOS Finder Screenshot: annotations folder (containing xml files), img folder (containing JPG images)
We’ve provided 12 chess images and 12 VOC XML annotations. While these are VOC XML, note that Roboflow supports almost every annotation format.
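
If you would like to inspect one of these annotation files yourself, the following is a minimal sketch of reading the labeled boxes from a VOC XML document with Python's standard library. The sample XML is made up for illustration (the filename and the "white-pawn" class name are assumptions, not taken from the sample dataset), and `read_voc_boxes` is our own helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the usual VOC XML layout.
SAMPLE = """<annotation>
  <filename>example.jpg</filename>
  <size><width>416</width><height>416</height><depth>3</depth></size>
  <object>
    <name>white-pawn</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>140</xmax><ymax>200</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        name = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Some tools write coordinates as floats, so parse via float first.
        box = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((name, box))
    return boxes


print(read_voc_boxes(SAMPLE))
```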

Once you drop the chess-tutorial-dataset folder into Roboflow, the images and annotations are processed so you can see the annotations overlaid on the images. If any of your annotations have errors (as is the case in this example we gave you!), Roboflow alerts you. In this case, some of the annotations improperly extended beyond the frame of an image. Roboflow intelligently crops the edge of the annotation to line up with the edge of the image and drops erroneous annotations that lie fully outside the image frame.
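
The correction described above can be sketched in a few lines. This is only an illustration of the idea (clip a box to the image frame, and drop it if nothing remains inside), not Roboflow's actual implementation; `clip_box` is a hypothetical helper and the image size is an assumed example value.

```python
def clip_box(box, img_width, img_height):
    """Clip an (xmin, ymin, xmax, ymax) box to the image frame.

    Returns the clipped box, or None if the box lies fully outside
    the image (that is, it has no area left after clipping).
    """
    xmin, ymin, xmax, ymax = box
    xmin = max(0, min(xmin, img_width))
    xmax = max(0, min(xmax, img_width))
    ymin = max(0, min(ymin, img_height))
    ymax = max(0, min(ymax, img_height))
    if xmax <= xmin or ymax <= ymin:
        return None
    return (xmin, ymin, xmax, ymax)


# A box hanging over the left and bottom edges is trimmed to the frame.
print(clip_box((-10, 20, 50, 500), 416, 416))
# A box entirely to the right of the image is dropped.
print(clip_box((450, 10, 500, 60), 416, 416))
```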

At this point, our images have not yet been uploaded to Roboflow. We can verify that all the images are, indeed, the ones we want to include in our dataset and that our annotations are being parsed properly. Any image can be deleted upon mousing over it and selecting the trash icon.

Everything now looks good. Click “Start Upload” in the upper right-hand corner! (This step goes faster with a faster internet connection.)

Preprocessing and Augmentations

Once the upload finishes, you’re directed to the dataset detail page for Chess Sample.

Roboflow Screenshot: Dataset detail page ("Chess Sample")
This is the “mission control” for your dataset.

From here, we can apply any preprocessing and augmentation steps that we want to our images. Roboflow seamlessly makes sure all of your annotations correctly bound each of your labeled objects -- even if you resize, rotate, or crop.
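
To see why annotations have to change along with the image, consider what resizing and flipping do to box coordinates. The sketch below shows the arithmetic for two simple cases; it is our own illustration under the assumption of (xmin, ymin, xmax, ymax) pixel boxes, not a description of how Roboflow implements these steps.

```python
def resize_box(box, old_size, new_size):
    """Scale a box when the image is resized from old_size to new_size.

    Sizes are (width, height) tuples.
    """
    (old_w, old_h), (new_w, new_h) = old_size, new_size
    scale_x, scale_y = new_w / old_w, new_h / old_h
    xmin, ymin, xmax, ymax = box
    return (xmin * scale_x, ymin * scale_y, xmax * scale_x, ymax * scale_y)


def hflip_box(box, img_width):
    """Mirror a box for a horizontal (left-right) flip of the image."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge, and vice versa.
    return (img_width - xmax, ymin, img_width - xmin, ymax)


print(resize_box((100, 200, 300, 400), (832, 832), (416, 416)))
print(hflip_box((10, 20, 110, 220), 416))
```

Rotations and crops follow the same principle but need more care (for example, re-fitting an axis-aligned box around rotated corners).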

By default, Roboflow opts you in to two preprocessing steps: auto-orient and resize. Auto-orient ensures your images are stored on disk the same way your applications display them. (If you’re unfamiliar, this can be a silent killer of computer vision models.) Resize creates a consistent size for your images (in this case, smaller, to expedite training).

You’ll see Roboflow supports auto-orient corrections, resizing, grayscaling, contrast adjustments, random flips, random 90-degree rotations, random 0 to N degree rotations, random brightness modifications, Gaussian blurring, random shearing, random cropping, and random noise. To better understand these options, refer to our documentation.

Why You Should Augment Data Before Training

If you’ve worked with computer vision, you’ll note augmentations are typically performed at the time of training your models. Augmenting before training has three notable advantages.

First, it improves model reproducibility, easing debugging and performance improvement. For example, your model might quietly perform better on images captured in brighter rooms. With saved copies of the darker augmented images, you can run inference on them directly, which makes debugging far easier.

Second, augmenting first reduces your training time and cost. When training with GPUs, augmentations are CPU-constrained operations. Thus, your GPUs are kept waiting for CPU operations to finish augmentations, increasing your training time and costs. (In our tests, augmenting first saved around 12% of training costs!)

Third, instead of (re)writing your augmentation scripts (or model config files) for each different model you train, you can have a consistent set of augmentations and images to compare across models head-to-head.
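
One way to picture the reproducibility and consistency points is an augmentation plan that is generated once, from a fixed random seed, and then reused for every model. The sketch below is purely illustrative: the function name, the parameter names, and the brightness range are all assumptions, and it only plans the augmentations rather than applying them to pixels.

```python
import random


def plan_augmentations(filenames, copies_per_image=2, seed=0):
    """Build a deterministic list of augmentation settings per image.

    The same seed always yields the same plan, so every model can be
    trained and debugged against an identical augmented dataset.
    """
    rng = random.Random(seed)
    plan = []
    for name in filenames:
        for copy_index in range(copies_per_image):
            plan.append({
                "source": name,
                "copy": copy_index,
                "flip": rng.random() < 0.5,
                # Hypothetical brightness shift in the range [-25%, +25%].
                "brightness": round(rng.uniform(-0.25, 0.25), 3),
            })
    return plan


for entry in plan_augmentations(["a.jpg", "b.jpg"]):
    print(entry)
```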

Exporting Data from Roboflow for Training

For our walkthrough, we’ll leave the settings as they were: auto-orient and resize to 416x416. To generate data for download, select “Export” in the upper right-hand corner.

Roboflow Screenshot: Export Dataset dialog (Export Name: 416x416-auto-orient)
You can choose to name this export, or leave the name field blank to default to naming with a timestamp.

After optionally naming, select Export. Roboflow is now preparing each of your images and annotations for download.

Once this finishes, you’ll be directed to the page for this specific export (note the menu on the left now includes an archive of this specific version), and a “Download Dataset” dialog is available.

Select the annotation format you need: CreateML JSON, Pascal VOC XML, YOLOv3 Darknet or flat text file, TensorFlow Object Detection CSV, or TensorFlow Records.
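
To illustrate how these formats differ, here is a minimal sketch converting one VOC-style box into a Darknet-style label line. It assumes the commonly used Darknet convention of "class_id x_center y_center width height", with all four values normalized by the image dimensions. The helper name and the class ID are hypothetical, and the exact files Roboflow writes may contain more than this.

```python
def voc_to_yolo_line(class_id, box, img_width, img_height):
    """Format an (xmin, ymin, xmax, ymax) pixel box as a Darknet label line."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2 / img_width
    y_center = (ymin + ymax) / 2 / img_height
    width = (xmax - xmin) / img_width
    height = (ymax - ymin) / img_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: a box on a 416x416 image, with an assumed class ID of 3.
print(voc_to_yolo_line(3, (104, 104, 312, 208), 416, 416))
```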

You may download this data locally as a zipped file by selecting “Download to your computer” or you can have Roboflow provide you with code to import the data directly into your Jupyter Notebook (including Colab) or a Python script.

For this walkthrough, let’s select “Pascal VOC” and download the data to our computer so that we can see the resulting images. A zip file downloads. It contains an export folder with your images and annotations and a README describing the transformations you provided to your data.
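
If you prefer to check the download from Python rather than your file browser, a small sketch like the following counts the files in the zip by extension. The helper name is our own and the path in the usage note is a placeholder; substitute the location of the file you actually downloaded.

```python
import zipfile
from pathlib import PurePosixPath


def summarize_export(zip_path):
    """Count the files in a zip archive, grouped by file extension."""
    counts = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # Skip directory entries.
            extension = PurePosixPath(name).suffix.lower().lstrip(".")
            counts[extension] = counts.get(extension, 0) + 1
    return counts
```

For example, `summarize_export("path/to/your-export.zip")` should report matching numbers of image and XML files, plus the README.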

Now, for the sake of example, go back to the Roboflow export you just created. This time, ask Roboflow to show you the download code for a Jupyter Notebook rather than downloading to your computer.

Roboflow Screenshot: Download Dataset dialog (Jupyter notebook download snippets)
Note I’ve obscured my API keys here. You should not share your keys.
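
One common way to keep a key out of a shared notebook is to read it from an environment variable instead of pasting it into a cell. The sketch below shows that pattern using only the standard library. The variable name `ROBOFLOW_API_KEY` and both helper functions are our own assumptions, not part of the snippet Roboflow generates.

```python
import os


def load_api_key(env_var="ROBOFLOW_API_KEY"):
    """Read the API key from an environment variable (name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key


def mask_key(key, visible=4):
    """Hide all but the last few characters, for safe logging or screenshots."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


print(mask_key("abcdef1234"))
```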

You’re now ready to use your data to train a custom object detection model! You may want to select one from our model library, which contains ready-to-go Jupyter notebooks in frameworks like PyTorch, Keras, and TensorFlow that you can run for free right within Google Colab.

Reach out with any questions, like supporting an annotation format or feature requests:

We’re excited to see what you build with Roboflow!