1. Overview
The dataset was created by adding object detection annotations to photographs of wild birds taken at the Toba River near my house. Each annotation records the type of bird photographed and the rectangular area surrounding it.
Label Studio was used for the annotation. Label Studio can work together with machine learning models such as YOLO, and this feature was used to annotate the data.
This page describes the procedure.
The version of Label Studio used is v1.15.0, and the version of the Label Studio ML backend used for connecting with machine learning models is 2.0.1dev0.
The images of wild birds were taken with a Sony Alpha 6000 digital camera and resized from 6,000 x 4,000 pixels to 640 x 427 pixels.
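As a reference, resizing of this kind can be done with a short Python script. The following is a minimal sketch assuming the Pillow library; the directory names are placeholders, not the paths actually used for this dataset.

# resize_images.py -- minimal sketch, assuming Pillow is installed (pip install pillow).
# The directory names below are placeholders, not the actual paths used for the dataset.
from pathlib import Path
from PIL import Image

SRC = Path("original_images")   # 6,000 x 4,000 pixel JPEGs
DST = Path("resized_images")    # 640 x 427 pixel output
DST.mkdir(exist_ok=True)

for src_path in sorted(SRC.glob("*.JPG")):
    with Image.open(src_path) as img:
        resized = img.resize((640, 427), Image.Resampling.LANCZOS)
        resized.save(DST / src_path.name, quality=95)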
2. Install and configure Label Studio
I installed Label Studio on Ubuntu 24.04 running on WSL2 on Windows 11.
2.1. Install and launch Label Studio
I installed and started Label Studio with the commands shown below. The Python version used is 3.12.3 and the pip version is 24.0.
This is a separate issue from the Label Studio installation, but with the above version of pip on Ubuntu 24.04, I set up a venv environment before using pip, following the steps below. This is because when I tried to install Label Studio with “pip install label-studio” immediately after installing pip, I got a message about the PEP 668 specification and could not install it as is.
First, the following commands were executed to install pip and enable the use of venv.
$ sudo apt-get update
$ sudo apt-get upgrade
$ sudo apt-get install python3-pip
$ sudo apt-get install python3-full
Next, the following command was executed in the directory /home/username.
$ python3 -m venv .python3_venv
I then added the following line to the end of /home/username/.bashrc. I logged out and logged in again.
source $HOME/.python3_venv/bin/activate
When I logged back in, the terminal prompt changed from a prompt like the example below
username@LAPTOP-LQJ83VHB:~$
to the prompt like the example below.
(.python3_venv) username@LAPTOP-LQJ83VHB:~$
$ pip install label-studio
$ label-studio start &
After executing the above commands, open http://localhost:8080/ in a browser to use Label Studio.
Click on the Sign Up link in the lower right corner of the first image and create an account on the page in the second image. After creating an account, a page like the third image will be shown.
2.2. Project Creation
On the page of the first image, click Create Project, and on the page of the second image, enter the necessary information in Project Name, Description, and click Save. The page in the third image will appear.
2.3. Loading images placed in a folder on PC
Place the images to be loaded in the C:\dev\data\tobagawa\tobagawa-2025-01\images folder on Windows 11.
This folder is referenced by Ubuntu on WSL2 as the directory /mnt/c/dev/data/tobagawa/tobagawa-2025-01/images.
To load the images placed in a folder on the PC according to the procedure described in Local Storage, stop Label Studio once.
In the Ubuntu terminal where you launched Label Studio, bring it to the foreground with the fg command and press Ctrl + C to stop it. The terminal will look like the following.
(.python3_venv) username@LAPTOP-LQJ83VHB:~$ fg
label-studio start
^C(.python3_venv) username@LAPTOP-LQJ83VHB:~$
Next, add the following lines to the end of /home/username/.bashrc. Then log out and log in to Ubuntu 24.04. The images to be loaded were placed in a subdirectory of the directory specified in LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT.
export LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
export LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/mnt/c/dev/data/tobagawa/tobagawa-2025-01
After re-logging in, start Label Studio again with the following command.
$ label-studio start &
After restarting Label Studio, the page in the first image will appear. Selecting “Project” will take you to the page in the second image. Click Settings in the upper right corner to go to the page in the third image.
Selecting Cloud Storage on the left side of the settings screen will take you to the page with the first image in the image list below.
Click Add Source Storage and enter the information as shown in the second image. Select Local files as the Storage Type and enter an appropriate title in Storage Title. Enter the path of the directory where you placed the images in “Absolute local path”; the directory specified here should be a subdirectory of the directory specified in LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT. The path is cut off in the middle of the screenshot, but the value used in this example is /mnt/c/dev/data/tobagawa/tobagawa-2025-01/images. Enter a regular expression for the names of the target image files in File Filter Regex; this example targets files with the JPG extension. Since the targets are JPG image files, enable “Treat every bucket object as a source file”. Click “Add Storage” or “Save” to close the pop-up window.
Click “Add Target Storage” and enter the information as shown in the third image on the page.
Enter the path of the directory where you placed the images in the “Absolute local path” field.
Clicking the “Sync Storage” button for the added “Source Storage” and “Target Storage” will take you to the page in the fourth image.
2.4. Annotation of wild bird object detection (without ML backend)
Select “Labeling Interface” on the left side of the settings screen to go to the page in the first image. Click “Browse Templates” and select “Object Detection with Bounding Boxes” (second image). Enter label names in the “Add label names” field and click Add (third image).
Click on the project name Tobagawa-2025-01-with-YOLO at the top of the screen in the settings page to go to the project page (first image).
Click on one of the images and annotate it. After selecting the label name at the bottom of the image, a rectangular area annotation can be added by mouse dragging (second image).
Click Submit to save the annotation.
3. Configure Ultralytics YOLO as a Label Studio ML backend
Using Label Studio’s ML backend, you can pre-annotate images with the predictions of a machine learning model.
The procedure described in the YOLO ML backend for Label Studio was used to annotate with the predictions of the YOLO object detection neural network.
3.1. Launch YOLO ML backend with docker compose
Open an Ubuntu 24.04 terminal in a Windows Terminal tab different from the one where you started Label Studio with the following command (see image below).
$ label-studio start &
This is not part of the Label Studio and ML backend setup, but if you have not yet installed Docker and Docker Compose, follow the instructions on the following page.
https://docs.docker.com/engine/install/ubuntu/
# Add Docker's official GPG key:
$ sudo apt-get update
$ sudo apt-get install ca-certificates curl
$ sudo install -m 0755 -d /etc/apt/keyrings
$ sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
$ sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
$ echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
$ sudo apt-get update

# Install the Docker packages
$ sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Execute the following commands in a new tab.
$ git clone https://github.com/HumanSignal/label-studio-ml-backend.git
$ cd label-studio-ml-backend/label_studio_ml/examples/yolo
Set values for LABEL_STUDIO_URL and LABEL_STUDIO_API_KEY in the docker-compose.yml placed in the yolo directory.
LABEL_STUDIO_URL is the URL used to open Label Studio in a browser. However, localhost cannot be specified here, because the ML backend accesses Label Studio from inside a Docker container.
Use the ifconfig command to find out the IP address corresponding to docker0 and set it as shown below. In the following example, LABEL_STUDIO_URL is set to http://172.17.0.1:8080.
(.python3_venv) fukagai@LAPTOP-LQJ83VHB:~/label-studio-ml-backend/label_studio_ml/examples/yolo$ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:26:b4:7a:e5  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
LABEL_STUDIO_API_KEY can be found with the following procedure.
Click on the user icon in the upper right corner of the page and select Account & Settings (first image). The Access Token shown in the yellow-green filled area in the second image is the string to be set in LABEL_STUDIO_API_KEY.
Open docker-compose.yml in an editor, set values for LABEL_STUDIO_URL and LABEL_STUDIO_API_KEY, and then execute the following command.
$ sudo docker compose up --build
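As an optional check before connecting from Label Studio, you can confirm that the backend is reachable. The following is a minimal Python sketch; it assumes the container is running and exposes the /health endpoint of the Label Studio ML backend framework on port 9090 (the Backend URL used later in this article).

# check_backend.py -- minimal sketch; assumes the ML backend container is running
# and serves the /health endpoint on port 9090.
import urllib.request

with urllib.request.urlopen("http://localhost:9090/health", timeout=5) as resp:
    print(resp.status, resp.read().decode())  # expect HTTP 200 and a small JSON status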
3.2. Label Studio setup with ML backend
Open Label Studio in a browser, go to Settings – Labeling Interface, and configure it to use the ML backend.
In the settings below, “bird” is the label name used by the YOLO ML backend. Detection results labeled “bird” will later be changed to one of カルガモ, オオバン, マガモ(オス), ダイサギ, アオサギ, or カワウ.
<View>
  <Image name="image" value="$image"/>
  <RectangleLabels name="label" toName="image" model_score_threshold="0.25">
    <Label value="bird" background="red"/>
    <Label value="カルガモ" background="brown"/>
    <Label value="オオバン" background="gray"/>
    <Label value="マガモ(オス)" background="green"/>
    <Label value="ダイサギ" background="white"/>
    <Label value="アオサギ" background="blue"/>
    <Label value="カワウ" background="black"/>
  </RectangleLabels>
</View>
Next, in Label Studio, go to Settings – Model and configure to use the ML backend.
On the first image page, click Connect Model.
On the second image page, enter the information as shown in the image example. The Backend URL is http://localhost:9090 and Interactive preannotations are enabled. Then click Validate and Save.
It will look like the page in the third image.
4. Creating a dataset for YOLO object detection with Label Studio
4.1. Annotation of bird object detection (with ML backend)
Return to the Label Studio project and select the image.
As shown in the first image, the birds are initially detected by the trained object detection model. After clicking on the detected bird rectangle area, you can label the bird type by clicking on the bird type label.
By holding down the Ctrl key and clicking on a rectangular area, you can select multiple rectangular areas as shown in the second image. If you select “カルガモ” in the second image, it will look like the third image.
As shown in the first image below, after selecting a rectangular area, you can also adjust the extent of the rectangular area. Check the Regions list in the lower right corner of the second image for extra labels.
Birds that the trained object detection model fails to detect are labeled manually with a rectangular region. A wrongly detected bird rectangle can be removed by selecting the rectangle and pressing Backspace.
Click Submit to save the annotation.
4.2. Remove unused images from the project
Delete unused images imported into the Label Studio project by following the steps below.
First, select the checkbox of the image to be deleted on the Image List page (first image). Next, click on Tasks in the upper left corner of the screen and select Delete Tasks (second image).
4.3. Output dataset (YOLO format) for object detection with Label Studio
Once all image data has been annotated, output the dataset for object detection.
Click the Export button in the upper right corner of the Label Studio project page (first image). To output a dataset of object detection in YOLO format, select YOLO and click the Export button in the lower right corner (second image).
A compressed file of the dataset will be added to the C:\Users\username\Downloads folder on Windows 11.
5. Training YOLO’s object detection model with the created dataset
5.1. Preparing datasets for training
The compressed file of the dataset added to the Downloads folder in 4.3. above is named something like project-1-at-2025-01-25-10-43-f2cc8f57.zip. When it is extracted, it has the following structure.
- project-1-at-2025-01-25-10-43-f2cc8f57/
  ├─ classes.txt
  ├─ notes.json
  ├─ images/
  │   ├─ DSC00784.JPG
  │   ├─ DSC00785.JPG
  │   ├─ ...
  │   └─ DSC01432.JPG
  └─ labels/
      ├─ DSC00784.txt
      ├─ DSC00785.txt
      ├─ ...
      └─ DSC01432.txt
classes.txt is a text file with the following contents.
bird
アオサギ
オオバン
カルガモ
カワウ
ダイサギ
ヒドリガモ(オス)
ヒドリガモ(メス)
マガモ(オス)
マガモ(メス)
The example YOLO format dataset shown here is a dataset output with slightly different settings than the Labeling Interface settings described in 3.2. above. (The only difference is the number, type, and order of labels.)
Based on the above data, I prepared the following data.
First, the files in the images and labels directories of project-1-at-2025-01-25-10-43-f2cc8f57 were moved to tobagawa-2025-01/images/train and tobagawa-2025-01/labels/train, respectively.
Next, some of the image files that were placed in tobagawa-2025-01/images/train were moved to tobagawa-2025-01/images/val and tobagawa-2025-01/images/test directories.
Correspondingly, the label files placed in the tobagawa-2025-01/labels/train directory with the same names as those image files (except for the extension) were moved to the tobagawa-2025-01/labels/val and tobagawa-2025-01/labels/test directories.
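This kind of split can also be scripted. The following is a minimal Python sketch, with assumed split ratios and random seed for illustration; it is not the exact procedure I used.

# split_dataset.py -- minimal sketch of moving a random subset of train images (and their
# label files) to val/ and test/. The ratios and random seed are assumptions for illustration.
import random
import shutil
from pathlib import Path

ROOT = Path("tobagawa-2025-01")
random.seed(0)

images = sorted((ROOT / "images" / "train").glob("*.JPG"))
random.shuffle(images)

n_val = int(len(images) * 0.1)   # assumed 10% validation split
n_test = int(len(images) * 0.1)  # assumed 10% test split
splits = {"val": images[:n_val], "test": images[n_val:n_val + n_test]}

for split, files in splits.items():
    (ROOT / "images" / split).mkdir(parents=True, exist_ok=True)
    (ROOT / "labels" / split).mkdir(parents=True, exist_ok=True)
    for img in files:
        label = ROOT / "labels" / "train" / (img.stem + ".txt")
        shutil.move(str(img), ROOT / "images" / split / img.name)
        if label.exists():
            shutil.move(str(label), ROOT / "labels" / split / label.name)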
The directory structure of the prepared dataset is as follows.
- tobagawa-2025-01/
  ├─ data.yaml
  ├─ images/
  │   ├─ train/
  │   │   ├─ DSC00784.JPG
  │   │   ├─ DSC00785.JPG
  │   │   ├─ ...
  │   │   └─ DSC01432.JPG
  │   ├─ val/
  │   │   ├─ DSC00801.JPG
  │   │   ├─ DSC00811.JPG
  │   │   ├─ ...
  │   │   └─ DSC01399.JPG
  │   └─ test/
  │       ├─ DSC00804.JPG
  │       ├─ DSC00818.JPG
  │       ├─ ...
  │       └─ DSC01390.JPG
  └─ labels/
      ├─ train/
      │   ├─ DSC00784.txt
      │   ├─ DSC00785.txt
      │   ├─ ...
      │   └─ DSC01432.txt
      ├─ val/
      │   ├─ DSC00801.txt
      │   ├─ DSC00811.txt
      │   ├─ ...
      │   └─ DSC01399.txt
      └─ test/
          ├─ DSC00804.txt
          ├─ DSC00818.txt
          ├─ ...
          └─ DSC01390.txt
data.yaml is a text file with the following contents. The file uses UTF-8 character encoding and CR LF line endings.
path: /mnt/c/dev/data/custom/tobagawa/tobagawa-2025-01  # dataset root dir
train: images/train  # train images (relative to 'path')
val: images/val  # val images (relative to 'path')
test: images/test  # test images (relative to 'path')

# Classes
names:
  0: bird
  1: アオサギ
  2: オオバン
  3: カルガモ
  4: カワウ
  5: ダイサギ
  6: ヒドリガモ(オス)
  7: ヒドリガモ(メス)
  8: マガモ(オス)
  9: マガモ(メス)
/mnt/c/dev/data/custom/tobagawa/tobagawa-2025-01 is the path of the directory where data.yaml is located.
The label names were arranged in the same order as they appeared in classes.txt. Each label name is preceded by a number, starting from 0 and incrementing by one, and a colon symbol “:”.
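The names section can also be generated mechanically from classes.txt. The following is a minimal Python sketch, assuming classes.txt is in the current directory.

# make_names.py -- minimal sketch that prints the 'names' section of data.yaml
# from classes.txt, numbering the labels from 0 in the order they appear.
from pathlib import Path

classes = Path("classes.txt").read_text(encoding="utf-8").splitlines()

print("names:")
for index, name in enumerate(c for c in classes if c.strip()):
    print(f"  {index}: {name}")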
5.2. Confirmation of training and prediction of the YOLO object detection model
As with the method described in this link, I confirmed that it works in an environment with the latest Ultralytics YOLO installed on Ubuntu.
Since I checked with the latest Ultralytics YOLO, I was able to use YOLO11, unlike in the article linked above.
The following procedure used an ESPRIMO WD2/H2 desktop computer with an NVIDIA GeForce GTX 1650 (GPGPU). The program was run on Ubuntu 22.04 using WSL2.
I used a PC whose specifications are shown at the bottom of this page.
5.2.1. YOLO Object Detection Model training
Using the created dataset, I trained the YOLO object detection model to estimate bird type and location.
The following command was used to train the YOLO object detection model.
$ yolo train model=yolo11n.pt data=/mnt/c/dev/data/custom/tobagawa/tobagawa-2025-01/data.yaml batch=-1 epochs=40
The yolo11n.pt specified as model is a relatively small object detection model that has already been trained on another dataset. I trained it further on the new dataset.
The file /mnt/c/dev/data/custom/tobagawa/tobagawa-2025-01/data.yaml specified as “data” is the file in the dataset prepared in 5.1. above.
For batch, -1 is specified. With this value, an appropriate batch size is selected automatically based on the GPU memory size of the GPGPU being used.
epochs specifies the number of epochs, i.e., the number of times training passes over the dataset. I checked the results for three different epoch counts (10, 20, and 40) and confirmed that the prediction results improve as the number of epochs is increased.
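For reference, the same training run can be expressed with the Ultralytics Python API instead of the yolo CLI. The following is a minimal sketch of an equivalent call, not the exact command used in this article.

# train_yolo.py -- minimal sketch of the same training run via the Ultralytics Python API.
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # pretrained, relatively small detection model
model.train(
    data="/mnt/c/dev/data/custom/tobagawa/tobagawa-2025-01/data.yaml",
    batch=-1,   # let Ultralytics pick a batch size from available GPU memory
    epochs=40,
)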
5.2.2. Confirmation of prediction results using the trained YOLO object detection model
Prediction by the trained YOLO object detection model was executed with the following command.
$ yolo predict model=./runs/detect/train5/weights/best.pt source=/mnt/c/dev/data/custom/tobagawa/tobagawa-2025-01/images/test/
The ./runs/detect/train5/weights/best.pt specified as “model” is the object detection model obtained by the training described in 5.2.1. above. The path where the trained weights are stored depends on the environment; since this was the fifth training run in my environment, train5 is included in the path.
The /mnt/c/dev/data/custom/tobagawa/tobagawa-2025-01/images/test/ specified as “source” is the path to the directory where the test images are located. When training was executed in 5.2.1. above, images in images/train and images/val and labels in labels/train and labels/val were referenced. Images in images/test are images that are not referenced in training.
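The same prediction can also be run from the Ultralytics Python API. The following is a minimal sketch of an equivalent call, not the exact command used to produce the results below.

# predict_yolo.py -- minimal sketch of running prediction with the trained weights
# via the Ultralytics Python API (the article used the yolo CLI).
from ultralytics import YOLO

model = YOLO("./runs/detect/train5/weights/best.pt")  # weights from the training run above
results = model.predict(
    source="/mnt/c/dev/data/custom/tobagawa/tobagawa-2025-01/images/test/",
    save=True,  # save annotated images, as the CLI command does
)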
The images below show the 10 detection results output by the above command. In some areas, a single bird is detected as multiple birds, but the detection results are mostly correct.
I placed the compressed file “tobagawa-2025-01.zip” at this link.
(The images placed in images/train, images/val, and images/test are all similar images taken from the bank of the Toba River. I would like to prepare a more diverse set of images to experiment with.)