ScanNet for 3D Semantic Segmentation

Dataset preparation

The overall process is similar to that of the ScanNet 3D detection task; please refer to the dataset preparation section of the ScanNet 3D detection documentation. Only the differences and the additional information specific to 3D semantic segmentation data are listed below.

Export ScanNet data

Since ScanNet provides an online benchmark for evaluating 3D semantic segmentation on the test set, we also need to download the test scans and put them under the scannet folder.

The directory structure before data preparation should be as below:

mmdetection3d
├── mmdet3d
├── tools
├── configs
├── data
│   ├── scannet
│   │   ├── meta_data
│   │   ├── scans
│   │   │   ├── scenexxxx_xx
│   │   ├── scans_test
│   │   │   ├── scenexxxx_xx
│   │   ├── batch_load_scannet_data.py
│   │   ├── load_scannet_data.py
│   │   ├── scannet_utils.py
│   │   ├── README.md

Under the scans_test folder there are 100 test folders, in which only the raw point cloud data and its meta file are saved. For instance, the folder scene0707_00 contains the following files:

scene0707_00_vh_clean_2.ply: Mesh file storing coordinates and colors of each vertex. The mesh’s vertices are taken as raw point cloud data.
scene0707_00.txt: Meta file including sensor parameters, etc. Note: unlike the data under scans, no axis-alignment matrix is provided for test scans.
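
The missing alignment matrix matters for any code that reads the meta file. Below is a minimal sketch, assuming the meta file stores its entries as `key = value` lines and that the alignment entry is named `axisAlignment`, as in the training scans. The helper name `read_axis_align_matrix` is hypothetical, and falling back to the identity matrix is one reasonable choice rather than a documented requirement.

```python
import numpy as np


def read_axis_align_matrix(meta_text):
    """Return the 4x4 axis-alignment matrix, or identity if it is absent.

    Assumes `key = value` lines and an `axisAlignment` key (an assumption
    about the meta file format, not taken from the text above).
    """
    for line in meta_text.splitlines():
        key, sep, value = line.partition('=')
        if sep and key.strip() == 'axisAlignment':
            values = [float(v) for v in value.split()]
            return np.array(values, dtype=np.float64).reshape(4, 4)
    # Test scans carry no alignment entry, so leave the points untransformed.
    return np.eye(4)


# Toy meta contents: one with an alignment entry, one without (test scan).
train_like = "axisAlignment = 1 0 0 0 0 1 0 0 0 0 1 2.5 0 0 0 1\nnumColorFrames = 10\n"
test_like = "numColorFrames = 10\n"

print(read_axis_align_matrix(train_like))
print(read_axis_align_matrix(test_like))
```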

Export ScanNet data by running python batch_load_scannet_data.py. Note: only point cloud data will be saved for test set scans because no annotations are provided.

Create dataset

Similar to the 3D detection task, we create the dataset by running python tools/create_data.py scannet --root-path ./data/scannet --out-dir ./data/scannet --extra-tag scannet. The directory structure after processing should be as below:

scannet
├── meta_data
├── batch_load_scannet_data.py
├── load_scannet_data.py
├── scannet_utils.py
├── README.md
├── scans
├── scans_test
├── scannet_instance_data
├── points
│   ├── xxxxx.bin
├── instance_mask
│   ├── xxxxx.bin
├── semantic_mask
│   ├── xxxxx.bin
├── seg_info
│   ├── train_label_weight.npy
│   ├── train_resampled_scene_idxs.npy
│   ├── val_label_weight.npy
│   ├── val_resampled_scene_idxs.npy
├── scannet_infos_train.pkl
├── scannet_infos_val.pkl
├── scannet_infos_test.pkl

seg_info: The generated infos to support semantic segmentation model training.
- train_label_weight.npy: Weighting factor for each semantic class. Since the number of points varies greatly between classes, label re-weighting is commonly used to improve performance.
- train_resampled_scene_idxs.npy: Re-sampling index for each scene. Each room is sampled a number of times that depends on its number of points, which balances the training data.
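
To make the two files more concrete, the sketch below computes both quantities for a toy dataset. It is an illustration under stated assumptions, not necessarily what tools/create_data.py does: the inverse-log weighting formula, the `num_points_per_sample` value, and both function names are chosen here for illustration only.

```python
import numpy as np


def label_weights(semantic_masks, num_classes):
    """Per-class weights that grow as a class becomes rarer.

    Uses 1 / log(1.2 + frequency), an assumed (but common) weighting scheme.
    Labels >= num_classes are treated as ignored.
    """
    counts = np.zeros(num_classes, dtype=np.float64)
    for mask in semantic_masks:
        valid = mask[mask < num_classes]
        counts += np.bincount(valid, minlength=num_classes)
    freq = counts / counts.sum()
    return 1.0 / np.log(1.2 + freq)


def resampled_scene_idxs(points_per_scene, num_points_per_sample):
    """Repeat each scene index roughly in proportion to its point count."""
    idxs = []
    for scene_idx, num_points in enumerate(points_per_scene):
        repeats = max(1, int(round(num_points / num_points_per_sample)))
        idxs.extend([scene_idx] * repeats)
    return np.array(idxs, dtype=np.int32)


# Toy data: two scenes, three classes, label 3 plays the role of "ignore".
masks = [np.array([0, 0, 0, 1, 2, 2]), np.array([0, 0, 3])]
weights = label_weights(masks, num_classes=3)
scene_idxs = resampled_scene_idxs([20000, 8000, 100], num_points_per_sample=8192)
print(weights)
print(scene_idxs)
```

With this scheme the rarest class receives the largest weight, and a scene with many points appears more often in the index list than a small one.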

Training pipeline

A typical training pipeline of ScanNet for 3D semantic segmentation is as below:

# num_points and class_names are assumed to be defined earlier in the config.
train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='DEPTH',
        shift_height=False,
        use_color=True,
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4, 5]),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=False,
        with_label_3d=False,
        with_mask_3d=False,
        with_seg_3d=True),
    dict(
        type='PointSegClassMapping',
        valid_cat_ids=(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28,
                       33, 34, 36, 39),
        max_cat_id=40),
    dict(
        type='IndoorPatchPointSample',
        num_points=num_points,
        block_size=1.5,
        ignore_index=len(class_names),
        use_normalized_coord=False,
        enlarge_size=0.2,
        min_unique_num=None),
    dict(type='NormalizePointsColor', color_mean=None),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'pts_semantic_mask'])
]

PointSegClassMapping: Only the valid category ids are mapped to class label ids in the range [0, 20) during training. All other class ids are converted to ignore_index, which equals 20.
IndoorPatchPointSample: Crops a patch containing a fixed number of points from the input point cloud. block_size indicates the size of the cropped block, typically 1.5 for ScanNet.
NormalizePointsColor: Normalizes the RGB color values of the input point cloud by dividing them by 255.
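
The class mapping and the color normalization are simple array operations. The following sketch reproduces the described behaviour with plain NumPy on toy data. The function names `map_seg_classes` and `normalize_colors` are hypothetical, and the code illustrates the idea rather than the library implementation.

```python
import numpy as np

VALID_CAT_IDS = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 24, 28,
                 33, 34, 36, 39)
MAX_CAT_ID = 40


def map_seg_classes(raw_labels, valid_cat_ids=VALID_CAT_IDS,
                    max_cat_id=MAX_CAT_ID):
    """Map raw category ids to [0, len(valid_cat_ids)); others -> ignore."""
    ignore_index = len(valid_cat_ids)
    # Lookup table indexed by raw category id, filled with ignore_index.
    lookup = np.full(max_cat_id + 1, ignore_index, dtype=np.int64)
    for class_id, cat_id in enumerate(valid_cat_ids):
        lookup[cat_id] = class_id
    return lookup[raw_labels]


def normalize_colors(points):
    """Scale the RGB columns (3:6) of an (N, 6) xyzrgb array to [0, 1]."""
    out = points.astype(np.float32)  # astype returns a copy
    out[:, 3:6] /= 255.0
    return out


raw = np.array([1, 39, 13, 0, 40])
mapped = map_seg_classes(raw)
pts = np.array([[0.1, 0.2, 0.3, 255.0, 0.0, 127.5]])
normalized = normalize_colors(pts)
print(mapped)
print(normalized)
```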

Metrics

Typically, mean Intersection over Union (mIoU) is used for evaluation on ScanNet. In detail, we first compute the IoU for each class and then average over the classes to get the mIoU. Please refer to seg_eval for the implementation.
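
The averaging can be sketched as follows. This is a minimal NumPy illustration, not the seg_eval implementation: the function name `mean_iou` is hypothetical, predictions are assumed to lie in [0, num_classes), and classes that appear in neither the prediction nor the ground truth are assumed to be skipped in the average.

```python
import numpy as np


def mean_iou(pred, gt, num_classes, ignore_index):
    """Return (mIoU, per-class IoU) from flat integer label arrays."""
    keep = gt != ignore_index          # drop points without a valid label
    pred, gt = pred[keep], gt[keep]
    # Confusion matrix: rows are ground truth, columns are predictions.
    conf = np.bincount(gt * num_classes + pred,
                       minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    with np.errstate(divide='ignore', invalid='ignore'):
        iou = tp / union               # NaN where a class never occurs
    return np.nanmean(iou), iou


gt = np.array([0, 0, 1, 1, 2, 3])      # label 3 is the ignore index here
pred = np.array([0, 1, 1, 1, 2, 0])
miou, per_class = mean_iou(pred, gt, num_classes=3, ignore_index=3)
print(miou, per_class)
```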

Testing and Making a Submission

By default, our codebase evaluates semantic segmentation results on the validation set. If you would like to test the model performance on the online benchmark, add the --format-only flag to the evaluation script and change ann_file=data_root + 'scannet_infos_val.pkl' to ann_file=data_root + 'scannet_infos_test.pkl' in the ScanNet dataset’s config. Remember to set txt_prefix to the directory in which the testing results should be saved.
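
As a sketch, the dataset change might look like the fragment below. It assumes that the config keeps the test split under `data = dict(test=dict(...))` and that `data_root` ends with a trailing slash. Both are assumptions about the config layout, and only the `ann_file` value comes from the text above.

```python
# Assumed value; use whatever data_root your config already defines.
data_root = './data/scannet/'

data = dict(
    test=dict(
        # Point the test split at the test-set infos instead of the
        # validation infos ('scannet_infos_val.pkl').
        ann_file=data_root + 'scannet_infos_test.pkl'))
```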

Taking PointNet++ (SSG) on ScanNet as an example, the following command runs inference on the test set:

./tools/dist_test.sh configs/pointnet2/pointnet2_ssg_16x2_cosine_200e_scannet_seg-3d-20class.py \
    work_dirs/pointnet2_ssg/latest.pth --format-only \
    --eval-options txt_prefix=work_dirs/pointnet2_ssg/test_submission

After generating the results, you can compress the folder and upload it to the ScanNet evaluation server.
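
One possible way to compress the folder is sketched below with the Python standard library. It assumes that the results are .txt files stored directly in the txt_prefix directory and that a zip archive is an accepted upload format; please check the benchmark's submission instructions for the exact layout it expects. The helper name `zip_submission` is hypothetical.

```python
import shutil
from pathlib import Path


def zip_submission(result_dir, archive_stem):
    """Zip the contents of result_dir into '<archive_stem>.zip'.

    Keep archive_stem outside result_dir so the archive does not
    end up inside itself.
    """
    result_dir = Path(result_dir)
    if not any(result_dir.glob('*.txt')):
        raise FileNotFoundError(f'no .txt results found in {result_dir}')
    # make_archive appends the '.zip' suffix and returns the archive path.
    return shutil.make_archive(str(archive_stem), 'zip',
                               root_dir=str(result_dir))


# Example call (paths follow the command above):
# zip_submission('work_dirs/pointnet2_ssg/test_submission',
#                'work_dirs/pointnet2_ssg/submission')
```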