VOC2012 Segmentation#

This topic describes how to manage the VOC2012 Segmentation dataset, which is a dataset with SemanticMask and InstanceMask labels (Fig. 4 and Fig. 5).


Fig. 4 The preview of a semantic mask from “VOC2012 Segmentation”.#


Fig. 5 The preview of an instance mask from “VOC2012 Segmentation”.#

Authorize a Client Instance#

An AccessKey is needed to authenticate your identity when using TensorBay.

from tensorbay import GAS

# Please visit `https://gas.graviti.cn/tensorbay/developer` to get the AccessKey.
gas = GAS("<YOUR_ACCESSKEY>")

Create Dataset#

gas.create_dataset("VOC2012Segmentation")


Organize Dataset#

Normally, dataloader.py and catalog.json are required to organize the “VOC2012 Segmentation” dataset into a Dataset instance. In this example, they are stored in the same directory:

VOC2012 Segmentation/
    catalog.json
    dataloader.py

Organizing the “VOC2012 Segmentation” dataset into a Dataset instance takes the following steps.

Step 1: Write the Catalog#

A Catalog contains all the label information of one dataset and is typically stored in a JSON file such as catalog.json.

{
    "SEMANTIC_MASK": {
        "categories": [
            { "name": "background", "categoryId": 0 },
            { "name": "aeroplane", "categoryId": 1 },
            { "name": "bicycle", "categoryId": 2 },
            { "name": "bird", "categoryId": 3 },
            { "name": "boat", "categoryId": 4 },
            { "name": "bottle", "categoryId": 5 },
            { "name": "bus", "categoryId": 6 },
            { "name": "car", "categoryId": 7 },
            { "name": "cat", "categoryId": 8 },
            { "name": "chair", "categoryId": 9 },
            { "name": "cow", "categoryId": 10 },
            { "name": "diningtable", "categoryId": 11 },
            { "name": "dog", "categoryId": 12 },
            { "name": "horse", "categoryId": 13 },
            { "name": "motorbike", "categoryId": 14 },
            { "name": "person", "categoryId": 15 },
            { "name": "pottedplant", "categoryId": 16 },
            { "name": "sheep", "categoryId": 17 },
            { "name": "sofa", "categoryId": 18 },
            { "name": "train", "categoryId": 19 },
            { "name": "tvmonitor", "categoryId": 20 },
            { "name": "void", "categoryId": 255 }
        ]
    },
    "INSTANCE_MASK": {
        "categories": [
            { "name": "background", "categoryId": 0 },
            { "name": "void", "categoryId": 255 }
        ]
    }
}

The annotation types for “VOC2012 Segmentation” are SemanticMask and InstanceMask. SemanticMask has 22 categories. InstanceMask has 2 categories: category 0 represents the background, and category 255 represents the border of instances.


  • Given the path of catalog.json, load_catalog() loads the catalog into the dataset.

  • The categories in InstanceMaskSubcatalog describe pixel values that are not instance IDs.


See catalog table for more catalogs with different label types.
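To make the catalog structure concrete, the following minimal sketch parses a trimmed excerpt of the catalog with the standard json module and builds a lookup from categoryId to category name. It does not use the TensorBay SDK; the helper category_names_by_id and the inline CATALOG_TEXT excerpt are illustrative assumptions, and a real script would read catalog.json from disk.

```python
import json

# Trimmed excerpt of the catalog shown above (assumption: only a few
# categories are kept here to keep the example short).
CATALOG_TEXT = """
{
    "SEMANTIC_MASK": {
        "categories": [
            { "name": "background", "categoryId": 0 },
            { "name": "aeroplane", "categoryId": 1 },
            { "name": "person", "categoryId": 15 },
            { "name": "void", "categoryId": 255 }
        ]
    },
    "INSTANCE_MASK": {
        "categories": [
            { "name": "background", "categoryId": 0 },
            { "name": "void", "categoryId": 255 }
        ]
    }
}
"""


def category_names_by_id(catalog: dict, label_type: str) -> dict:
    """Map each categoryId of one label type to its category name."""
    return {
        category["categoryId"]: category["name"]
        for category in catalog[label_type]["categories"]
    }


catalog = json.loads(CATALOG_TEXT)
semantic_names = category_names_by_id(catalog, "SEMANTIC_MASK")
instance_names = category_names_by_id(catalog, "INSTANCE_MASK")

print(semantic_names[15])
print(instance_names[255])
```

A lookup like this is handy later for translating the pixel values of a mask into readable category names.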

Step 2: Write the Dataloader#

A dataloader is needed to organize the dataset into a Dataset instance.

#!/usr/bin/env python3
#
# Copyright 2021 Graviti. Licensed under MIT License.
#
# pylint: disable=invalid-name

"""Dataloader of VOC2012Segmentation dataset."""

import os

from tensorbay.dataset import Data, Dataset
from tensorbay.label import InstanceMask, SemanticMask

_SEGMENT_NAMES = ("train", "val")
DATASET_NAME = "VOC2012Segmentation"


def VOC2012Segmentation(path: str) -> Dataset:
    """`VOC2012Segmentation <http://host.robots.ox.ac.uk/pascal/VOC/voc2012/>`_ dataset.

    The file structure should be like::

        <path>/
            JPEGImages/
                <image_name>.jpg
                ...
            SegmentationClass/
                <mask_name>.png
                ...
            SegmentationObject/
                <mask_name>.png
                ...
            ImageSets/
                Segmentation/
                    train.txt
                    val.txt
                    ...
                ...
            ...

    Arguments:
        path: The root directory of the dataset.

    Returns:
        Loaded :class:`~tensorbay.dataset.dataset.Dataset` instance.

    """
    root_path = os.path.abspath(os.path.expanduser(path))

    image_path = os.path.join(root_path, "JPEGImages")
    semantic_mask_path = os.path.join(root_path, "SegmentationClass")
    instance_mask_path = os.path.join(root_path, "SegmentationObject")
    image_set_path = os.path.join(root_path, "ImageSets", "Segmentation")

    dataset = Dataset(DATASET_NAME)
    dataset.load_catalog(os.path.join(os.path.dirname(__file__), "catalog.json"))

    for segment_name in _SEGMENT_NAMES:
        segment = dataset.create_segment(segment_name)
        with open(os.path.join(image_set_path, f"{segment_name}.txt"), encoding="utf-8") as fp:
            for stem in fp:
                stem = stem.strip()
                data = Data(os.path.join(image_path, f"{stem}.jpg"))
                label = data.label
                mask_filename = f"{stem}.png"
                label.semantic_mask = SemanticMask(os.path.join(semantic_mask_path, mask_filename))
                label.instance_mask = InstanceMask(os.path.join(instance_mask_path, mask_filename))

                segment.append(data)

    return dataset

See SemanticMask annotation and InstanceMask annotation for more details.
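Before running a dataloader like the one above, it can help to check that every stem listed in a split file has its image and both masks on disk. The sketch below is one possible pre-check using only the standard library. The helpers collect_sample_paths and find_missing_files are hypothetical and not part of the TensorBay SDK, and the stems in the demo are placeholders written to a temporary directory that mimics the expected layout.

```python
import os
import tempfile


def collect_sample_paths(root, segment_name):
    """Return (image, semantic mask, instance mask) path triples for one split."""
    split_file = os.path.join(root, "ImageSets", "Segmentation", f"{segment_name}.txt")
    samples = []
    with open(split_file, encoding="utf-8") as fp:
        for line in fp:
            stem = line.strip()
            if not stem:  # ignore blank lines in the split file
                continue
            samples.append(
                (
                    os.path.join(root, "JPEGImages", f"{stem}.jpg"),
                    os.path.join(root, "SegmentationClass", f"{stem}.png"),
                    os.path.join(root, "SegmentationObject", f"{stem}.png"),
                )
            )
    return samples


def find_missing_files(samples):
    """Return every path referenced by the samples that does not exist on disk."""
    return [path for sample in samples for path in sample if not os.path.isfile(path)]


# Demo on a temporary directory that mimics the expected file structure.
with tempfile.TemporaryDirectory() as root:
    folders = ("JPEGImages", "SegmentationClass", "SegmentationObject")
    for folder in folders + (os.path.join("ImageSets", "Segmentation"),):
        os.makedirs(os.path.join(root, folder))

    # Placeholder stems: the first has all three files, the second only its image.
    with open(
        os.path.join(root, "ImageSets", "Segmentation", "train.txt"), "w", encoding="utf-8"
    ) as fp:
        fp.write("sample_a\nsample_b\n")
    for folder, extension in zip(folders, ("jpg", "png", "png")):
        open(os.path.join(root, folder, f"sample_a.{extension}"), "wb").close()
    open(os.path.join(root, "JPEGImages", "sample_b.jpg"), "wb").close()

    samples = collect_sample_paths(root, "train")
    missing_names = sorted(os.path.relpath(p, root) for p in find_missing_files(samples))

print(len(samples), "samples listed")
print("missing:", missing_names)
```

Running such a check first surfaces an incomplete download as a short list of missing paths rather than as a failure partway through loading or uploading.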

The TensorBay SDK already includes a number of dataloaders provided by the community. Instead of writing a dataloader, you can import an available one.

from tensorbay.opendataset import VOC2012Segmentation

dataset = VOC2012Segmentation("<path/to/dataset>")


Note that the available dataloaders load their catalogs automatically, so users do not have to write them again.


See dataloader table for dataloaders with different label types.

Upload Dataset#

The organized “VOC2012 Segmentation” dataset can be uploaded to TensorBay for sharing, reuse, etc.

dataset_client = gas.upload_dataset(dataset, jobs=8)
dataset_client.commit("initial commit")

Similar to Git, the commit step after uploading records the changes to the dataset as a version. If needed, make further modifications and commit again. Please see Version Control for more details.

See the visualization on the TensorBay website.

Read Dataset#

Now the “VOC2012 Segmentation” dataset can be read from TensorBay.

from tensorbay.dataset import Dataset

dataset = Dataset("VOC2012Segmentation", gas)

The “VOC2012 Segmentation” dataset has two segments: train and val. Get a segment by passing the required segment name or its index.

segment_names = dataset.keys()
segment = dataset[0]

In the train segment, there is a sequence of data, which can be obtained by index.

data = segment[0]

Each data contains one SemanticMask annotation and one InstanceMask annotation.

from PIL import Image

label_semantic_mask = data.label.semantic_mask
semantic_all_attributes = label_semantic_mask.all_attributes
semantic_mask = Image.open(label_semantic_mask.open())

label_instance_mask = data.label.instance_mask
instance_all_attributes = label_instance_mask.all_attributes
instance_mask_url = label_instance_mask.get_url()

The “VOC2012 Segmentation” dataset has two label types: semantic_mask and instance_mask. Open a mask with Image.open(), or get its URL with get_url(). SemanticMask.all_attributes stores the attributes of every category in the categories list of SEMANTIC_MASK, and InstanceMask.all_attributes stores the attributes of every instance. See SemanticMask and InstanceMask label formats for more details.
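Once a mask is decoded into pixel values, those values can be interpreted with the catalog. The sketch below is a minimal illustration using only the standard library. The small nested lists stand in for decoded mask pixels (an assumption, since real masks are PNG files), and the helpers count_pixels_by_category and instance_ids are hypothetical. In a semantic mask each pixel value is a categoryId. In an instance mask, values other than 0 (background) and 255 (border) are treated as instance IDs.

```python
from collections import Counter

# Excerpt of the SEMANTIC_MASK categories from the catalog.
SEMANTIC_NAMES = {0: "background", 15: "person", 255: "void"}
# Pixel values of an instance mask that are not instance IDs.
NON_INSTANCE_VALUES = {0, 255}

# Toy 3x4 masks standing in for decoded mask pixels.
semantic_pixels = [
    [0, 0, 15, 15],
    [0, 255, 15, 15],
    [0, 0, 255, 0],
]
instance_pixels = [
    [0, 0, 1, 1],
    [0, 255, 1, 2],
    [0, 0, 255, 2],
]


def count_pixels_by_category(mask, names):
    """Count the pixels of each category, keyed by category name."""
    counts = Counter(value for row in mask for value in row)
    return {names.get(value, f"unknown({value})"): n for value, n in counts.items()}


def instance_ids(mask, non_instance_values=NON_INSTANCE_VALUES):
    """Return the sorted pixel values that identify instances."""
    return sorted({value for row in mask for value in row} - set(non_instance_values))


semantic_counts = count_pixels_by_category(semantic_pixels, SEMANTIC_NAMES)
found_instances = instance_ids(instance_pixels)

print(semantic_counts)
print(found_instances)
```

With a real mask, the same functions would apply after converting the opened image into rows of integer pixel values.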

Delete Dataset#
