Creating annotated data sets with IBM Cloud Annotations

By the end of this tutorial, you will be able to generate datasets with annotated images that can be used in Create ML for building and training object detection machine learning models for Core ML.

Training Image Classification models is quite simple: you just need to have your dataset organized in folders and you are good to go. Training Object Detection models, on the other hand, requires a special file telling Create ML where each object is located in each image of your dataset.

This special file is a JSON file named annotations.json, and trust me when I say that creating it manually is a herculean task.

Imagine having to write down the coordinates for the position of each object in each image of your dataset:

    [
        {
            "image": "image1.jpg",
            "annotations": [
                {
                    "label": "salmon",
                    "coordinates": { "x": 120, "y": 164, "width": 230, "height": 119 }
                },
                {
                    "label": "tuna",
                    "coordinates": { "x": 230, "y": 321, "width": 50, "height": 50 }
                }
            ]
        }
    ]

annotations.json file structure
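
To make the structure concrete, here is a minimal Python sketch that parses an entry like the one in the listing above and checks that it has the expected keys. The helper names `validate_entry` and `to_corners` are hypothetical, not part of any tool mentioned in this tutorial. The conversion assumes Create ML's convention that x and y mark the center of the box, in pixels, measured from the top-left corner of the image.

```python
import json

# The same sample entry as in the listing, embedded so the sketch runs on its own.
SAMPLE = """
[
    {
        "image": "image1.jpg",
        "annotations": [
            {"label": "salmon",
             "coordinates": {"x": 120, "y": 164, "width": 230, "height": 119}},
            {"label": "tuna",
             "coordinates": {"x": 230, "y": 321, "width": 50, "height": 50}}
        ]
    }
]
"""

REQUIRED_COORDS = ("x", "y", "width", "height")


def validate_entry(entry):
    """Return a list of problems found in one image entry (empty if none)."""
    problems = []
    if not isinstance(entry.get("image"), str):
        problems.append("missing 'image' file name")
    for i, ann in enumerate(entry.get("annotations", [])):
        if not ann.get("label"):
            problems.append(f"annotation {i}: missing 'label'")
        coords = ann.get("coordinates", {})
        for key in REQUIRED_COORDS:
            if not isinstance(coords.get(key), (int, float)):
                problems.append(f"annotation {i}: missing '{key}'")
    return problems


def to_corners(coords):
    """Convert a center-based box to (left, top, right, bottom).

    Assumption: x and y are the center of the box, as Create ML expects.
    """
    half_w = coords["width"] / 2
    half_h = coords["height"] / 2
    return (coords["x"] - half_w, coords["y"] - half_h,
            coords["x"] + half_w, coords["y"] + half_h)


entries = json.loads(SAMPLE)
for entry in entries:
    print(entry["image"], validate_entry(entry) or "looks fine")
    for ann in entry["annotations"]:
        print(" ", ann["label"], to_corners(ann["coordinates"]))
```

Even for a single image with two objects there are eight numbers to get right, which is why a visual annotation tool pays off quickly.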

Luckily there are tools and platforms out there to support you in this task.

One of them is IBM Cloud Annotations. It offers fast and easy image annotation in the browser and supports training machine learning models in the cloud as well as exporting datasets for Create ML.

It can be used for free, with some limitations, and should provide everything you need to create annotated datasets to train Object Detection models. Let's see how that works.

Illustration of an annotated image.

Before you start working in the platform, you need to collect and organize the images that will be used to train your model. You will need at least two different sets of images: one to train the model (called sushiguide-detector in this example) and a different one to test the model (called sushiguide-detector-test in this example).

Examples of folders to be used as training data.
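
If all your pictures currently sit in a single folder, a short script can do the split for you. The following is only a sketch under stated assumptions: the function name `split_images`, the source folder name `all-images`, and the 80/20 ratio are illustrative choices, not something the platform or Create ML requires.

```python
import random
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def split_images(source, train_dir, test_dir, test_fraction=0.2, seed=42):
    """Copy images from `source` into separate training and testing folders.

    Returns the number of images copied to each folder as (train, test).
    """
    source, train_dir, test_dir = Path(source), Path(train_dir), Path(test_dir)
    images = sorted(p for p in source.iterdir()
                    if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(images)

    # Keep at least one test image whenever there is more than one image.
    test_count = max(1, int(len(images) * test_fraction)) if len(images) > 1 else 0
    test_images, train_images = images[:test_count], images[test_count:]

    for folder, group in ((train_dir, train_images), (test_dir, test_images)):
        folder.mkdir(parents=True, exist_ok=True)
        for image in group:
            shutil.copy2(image, folder / image.name)
    return len(train_images), len(test_images)


# "all-images" is a hypothetical folder holding every collected picture.
if Path("all-images").is_dir():
    print(split_images("all-images", "sushiguide-detector", "sushiguide-detector-test"))
```

The script copies rather than moves the files, so the original collection stays untouched if you want to try a different split later.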

Once you have all the images organized you can start annotating them. Here are the steps you need to follow to get started:

1. After creating an account, the first step is to start a new project.

Starting a new project on IBM Cloud Annotations

2. Name the bucket for your dataset. Provide a name that will allow you to easily identify the dataset, like flowers-training and flowers-testing.

Creating a new bucket

3. When prompted about the annotation type the bucket will host, choose Localization.

Choosing a Localization annotation type

4. The final step is to upload the images that compose your dataset to the bucket and you are ready to start annotating.

Adding the training data to your bucket

Now you are ready to start creating annotations in your images.

By clicking and dragging over an image, you create boxes that indicate where the object you want to detect is located in the image. You can create multiple boxes if the image contains multiple objects to be detected.

Creating annotations on an image

After annotating all the objects in the image, you must label each of them in the sidebar on the right. Those labels tell the training algorithm what kind of object each box contains. Each label becomes a class of the object detection model, so objects that belong to the same class should be given the same label.

Creating labels
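
Because every distinct label becomes its own class, a small typo such as "Salmon" next to "salmon" would quietly create two classes. The sketch below shows one way to spot that once the annotations exist as JSON. The function names and the sample entries are hypothetical and only serve as an illustration.

```python
from collections import Counter


def count_labels(entries):
    """Count how many boxes carry each label across all images."""
    counts = Counter()
    for entry in entries:
        for annotation in entry.get("annotations", []):
            counts[annotation["label"]] += 1
    return counts


def near_duplicates(labels):
    """Group labels that differ only by case or surrounding spaces."""
    groups = {}
    for label in labels:
        groups.setdefault(label.strip().lower(), set()).add(label)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical annotations with a labeling slip ("Salmon" vs "salmon").
entries = [
    {"image": "image1.jpg", "annotations": [
        {"label": "salmon",
         "coordinates": {"x": 120, "y": 164, "width": 230, "height": 119}},
        {"label": "tuna",
         "coordinates": {"x": 230, "y": 321, "width": 50, "height": 50}},
    ]},
    {"image": "image2.jpg", "annotations": [
        {"label": "Salmon",
         "coordinates": {"x": 80, "y": 90, "width": 100, "height": 60}},
    ]},
]

counts = count_labels(entries)
print(dict(counts))             # boxes per label
print(near_duplicates(counts))  # labels that probably mean the same class
```

The per-label counts are also a quick way to see whether some classes have far fewer examples than others.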

Once you have finished annotating all your images, you just need to export your annotated dataset from the File menu, choosing Export as Create ML.

Exporting the dataset to be used with Create ML

The download contains all your images and the corresponding annotations.json file, which holds all the metadata Create ML needs to train an Object Detection model.

Downloaded data set with the annotations file
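
Before handing the folder to Create ML, it can be worth checking that the annotations and the image files actually match. The following sketch assumes the export is a flat folder with the images sitting next to annotations.json; that layout, the function name `check_export`, and the folder name used at the end are assumptions made for illustration.

```python
import json
from pathlib import Path


def check_export(folder):
    """Compare the images listed in annotations.json with the files on disk."""
    folder = Path(folder)
    entries = json.loads((folder / "annotations.json").read_text(encoding="utf-8"))
    listed = {entry["image"] for entry in entries}
    # Ignore the annotations file itself and hidden files such as .DS_Store.
    on_disk = {p.name for p in folder.iterdir()
               if p.is_file()
               and p.name != "annotations.json"
               and not p.name.startswith(".")}
    return {
        "missing_files": sorted(listed - on_disk),
        "unlisted_files": sorted(on_disk - listed),
        "images_without_boxes": sorted(
            e["image"] for e in entries if not e.get("annotations")),
    }


# Hypothetical location of the downloaded dataset.
export_dir = Path("sushiguide-detector")
if (export_dir / "annotations.json").is_file():
    for problem, names in check_export(export_dir).items():
        print(problem, names)
```

Three empty lists mean that every listed image exists, every image on disk is listed, and every image has at least one box.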

This tutorial is part of a series of articles derived from the presentation Creating Machine Learning Models with Create ML, presented as a one-time event at the Swift Heroes 2021 Digital Conference on April 16th, 2021.

Where to go next?

If you are interested in knowing more about Object Detection models, you can check other tutorials on:

You want to know more? There is more to see...

For a deeper dive into creating object detection machine learning models, you can watch the videos released by Apple at WWDC:

Other Resources:

If you are interested in knowing more about annotating images for creating object detection machine learning models, you can go through these resources: