This repository was archived by the owner on Jan 1, 2025. It is now read-only.

Custom Dataset structure guide on setting Class IDs and num_classes #247

@niqbal996

Hello,

Thanks for your work and for this repository. I want to use it to train on my custom dataset, which has the following structure. It's based on the Phenorob agricultural dataset from here.

Soil: Category ID 0 (Stuff)
Crop: Category ID 1 (Thing)
Weed: Category ID 2 (Thing)

I want to detect all three classes using both semantic and panoptic segmentation.

In my YAML config file, I have set NUM_CLASSES to 3 under SEM_SEG_HEAD:

SEM_SEG_HEAD:
    NAME: "MaskFormerHead"
    IGNORE_VALUE: 255
    NUM_CLASSES: 3
    LOSS_WEIGHT: 1.0
    CONVS_DIM: 256
    MASK_DIM: 256
    NORM: "GN"

For evaluation, I am using the COCO panoptic and semantic segmentation (SemSeg) evaluators. The semantic segmentation PNG label files use the following pixel values:

0 -> soil
1 -> crop
2 -> weed
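
One quick sanity check for this mapping is to count the raw values stored in the label PNGs and flag anything outside the expected set. The sketch below assumes the pixel values have already been read into a flat sequence of integers (for example, from a decoded PNG); `find_unexpected_labels` is a hypothetical helper name, not part of the repository.

```python
from collections import Counter

# Expected semantic label values, taken from the mapping above.
CLASS_NAMES = {0: "soil", 1: "crop", 2: "weed"}
IGNORE_VALUE = 255  # same value as IGNORE_VALUE in the SEM_SEG_HEAD config


def find_unexpected_labels(pixels, class_names=CLASS_NAMES, ignore_value=IGNORE_VALUE):
    """Return {value: count} for pixel values that are neither a class ID nor the ignore value."""
    allowed = set(class_names) | {ignore_value}
    counts = Counter(pixels)
    return {value: n for value, n in counts.items() if value not in allowed}


# Hypothetical flattened label image with one stray value (3) that no class covers.
example_pixels = [0, 0, 1, 2, 2, 255, 3]
unexpected = find_unexpected_labels(example_pixels)
print(unexpected)  # {3: 1}
```

An empty result means every pixel is either one of the three class IDs or the ignore value.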

The categories in the JSON file are:

"categories": [
        {
            "color": [
                0,
                0,
                0
            ],
            "id": 0,
            "isthing": 0,
            "name": "soil",
            "supercategory": "soil"
        },
        {
            "color": [
                111,
                74,
                0
            ],
            "id": 1,
            "isthing": 1,
            "name": "crop",
            "supercategory": "crop"
        },
        {
            "color": [
                230,
                150,
                140
            ],
            "id": 2,
            "isthing": 1,
            "name": "weed",
            "supercategory": "weed"
        }
    ],
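
A small consistency check between these categories and the configured number of classes could look like the sketch below. It assumes the dataset IDs are meant to be used directly as training IDs (so they should form the range 0..N-1); `check_categories` is a hypothetical helper, and the `color` and `supercategory` fields are omitted for brevity.

```python
def check_categories(categories, num_classes):
    """Return a list of problems; an empty list means the categories look consistent."""
    problems = []
    ids = [cat["id"] for cat in categories]
    if len(set(ids)) != len(ids):
        problems.append("duplicate category ids")
    if len(categories) != num_classes:
        problems.append(f"{len(categories)} categories but NUM_CLASSES is {num_classes}")
    # Assumption: dataset IDs double as training IDs, so they must be 0..N-1.
    if sorted(ids) != list(range(len(ids))):
        problems.append("category ids are not the contiguous range 0..N-1")
    return problems


categories = [
    {"id": 0, "isthing": 0, "name": "soil"},
    {"id": 1, "isthing": 1, "name": "crop"},
    {"id": 2, "isthing": 1, "name": "weed"},
]
print(check_categories(categories, num_classes=3))  # []
```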

The dataset is registered in the following manner:

def get_metadata():  # header missing from the original post; the function name is assumed
  meta = {}

  # Define classes and colors
  thing_classes = ["crop", "weed"]
  thing_colors = [(0, 0, 200), (200, 0, 0)]
  stuff_classes = ["soil"]
  stuff_colors = [(0, 0, 0)]

  meta["thing_classes"] = thing_classes
  meta["thing_colors"] = thing_colors
  meta["stuff_classes"] = stuff_classes
  meta["stuff_colors"] = stuff_colors

  # Map dataset IDs to contiguous IDs
  meta["thing_dataset_id_to_contiguous_id"] = {1: 1, 2: 2}  # 1 -> crop, 2 -> weed
  meta["stuff_dataset_id_to_contiguous_id"] = {0: 0}  # 0 -> soil

  # Set ignore label
  meta["ignore_label"] = 255

  # Additional metadata for visualization and evaluation:
  # list every class as "stuff" so the semantic evaluator sees all three
  meta["stuff_classes"] = stuff_classes + thing_classes
  meta["stuff_colors"] = stuff_colors + thing_colors
  meta["stuff_dataset_id_to_contiguous_id"].update(meta["thing_dataset_id_to_contiguous_id"])

  return meta
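
The same metadata could also be derived from a single categories list, so that class names, colors, and ID mappings cannot drift apart. The sketch below is one way to do that, not the repository's own code: `build_metadata` is a hypothetical helper, the colors are taken from the JSON categories rather than from the hand-written lists above, and it assumes every category is listed under `stuff_classes`, as the snippet above ends up doing.

```python
def build_metadata(categories, ignore_label=255):
    """Derive class names, colors, and ID mappings from one categories list."""
    things = [c for c in categories if c["isthing"] == 1]
    meta = {
        "thing_classes": [c["name"] for c in things],
        "thing_colors": [tuple(c["color"]) for c in things],
        # Assumption: all categories are listed as "stuff" so that a semantic
        # segmentation evaluator scores every class.
        "stuff_classes": [c["name"] for c in categories],
        "stuff_colors": [tuple(c["color"]) for c in categories],
        "ignore_label": ignore_label,
    }
    # Contiguous ID = position of the category in the full list.
    contiguous = {c["id"]: i for i, c in enumerate(categories)}
    meta["stuff_dataset_id_to_contiguous_id"] = dict(contiguous)
    meta["thing_dataset_id_to_contiguous_id"] = {c["id"]: contiguous[c["id"]] for c in things}
    return meta


categories = [
    {"id": 0, "isthing": 0, "name": "soil", "color": [0, 0, 0]},
    {"id": 1, "isthing": 1, "name": "crop", "color": [111, 74, 0]},
    {"id": 2, "isthing": 1, "name": "weed", "color": [230, 150, 140]},
]
meta = build_metadata(categories)
print(meta["stuff_classes"])                      # ['soil', 'crop', 'weed']
print(meta["thing_dataset_id_to_contiguous_id"])  # {1: 1, 2: 2}
```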

############################################################
DatasetCatalog.register(
    panoptic_name,
    lambda: merge_to_panoptic(
        load_pheno_panoptic_json(panoptic_json, image_root, panoptic_root, metadata),
        load_sem_seg(sem_seg_root, image_root, gt_ext='png', image_ext='png'),
    ),
)
MetadataCatalog.get(panoptic_name).set(
    panoptic_root=panoptic_root,
    image_root=image_root,
    panoptic_json=panoptic_json,
    sem_seg_root=sem_seg_root,
    json_file=instances_json,  # TODO rename
    evaluator_type="coco_panoptic_seg",
    label_divisor=1000,
    **metadata,
)

At the moment, I am only checking the semantic segmentation results, with the following test settings in the config file:

TEST:
      SEMANTIC_ON: True
      INSTANCE_ON: False
      PANOPTIC_ON: False
      OVERLAP_THRESHOLD: 0.8
      OBJECT_MASK_THRESHOLD: 0.8

I am really struggling to set up the experiment so that all three classes are detected. I would really appreciate it if you could point out any inconsistency in the above dataset structure or class names. I have tried different variations, but the predictions always seem to confuse the background (soil) class with either crop or weed. It seems trivial, but I have been scratching my head over this for many hours. Thank you.
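
To pin down which classes are being confused, a per-pixel confusion matrix between ground-truth and predicted labels can help. The sketch below is a minimal, library-free version; it assumes both label maps have been flattened to equal-length integer sequences and that predictions only contain the class IDs 0 to 2. The name `confusion_matrix` and the example data are hypothetical.

```python
def confusion_matrix(gt, pred, num_classes=3, ignore_value=255):
    """Count pixels per (ground-truth class, predicted class), skipping ignored pixels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        if g == ignore_value:
            continue  # ignored pixels do not contribute to evaluation
        matrix[g][p] += 1
    return matrix


# Hypothetical flattened label maps: two soil pixels are predicted as crop.
gt = [0, 0, 0, 1, 1, 2, 2, 255]
pred = [1, 1, 0, 1, 1, 2, 2, 0]
cm = confusion_matrix(gt, pred)

names = ["soil", "crop", "weed"]
for name, row in zip(names, cm):
    # Each row is a ground-truth class; each column is a predicted class.
    print(name, row)
```

Large off-diagonal counts in the soil row or column would indicate which class soil is being swapped with, which can hint at an ID offset between the label files and the metadata.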
