BinaryMaskDatasetAdapter

class cocohelper.adapters.binary_mask_adapter.BinaryMaskDatasetAdapter[source]

Bases: DatasetAdapter

A DatasetAdapter that converts a dataset with binary masks to COCO format.

Parameters:
  • data_paths – A dictionary that maps each image filename to its mask filenames.

  • image_loader – A function to load an image.

  • mask_loader – A function to load a mask.

  • categories – A dictionary that maps each mask value to a category.

  • mode – How to encode the mask; defaults to polygon.

  • compression_factor – Compression factor applied to the mask before it is encoded.

Method List

extract_bbox_from_binary_mask(binary_mask)

Extracts bounding box from segmentation mask.

get_categories()

Get the list of categories.

get_individual_instances(mask, mode, ...)

Separates disjoint objects inside the same array.

get_sample(idx)

Get the COCO representation for a specific sample and its index.

read_image(idx)

Reads an image from its positional index in the data paths.

Attributes List

_abc_impl

Methods Details

static extract_bbox_from_binary_mask(binary_mask)[source]

Extracts bounding box from segmentation mask.

NB: we do not support rotated bounding boxes.

Parameters:

binary_mask (ndarray) – binary semantic mask of an object.

Returns:

A bounding box surrounding the input semantic mask.

Return type:

List
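The following is a minimal sketch of how an axis-aligned bounding box can be derived from a binary mask with NumPy. It is not the library's implementation: the function name bbox_from_binary_mask, the [x, y, width, height] ordering (the usual COCO convention), and the all-zero box returned for an empty mask are assumptions made for illustration.

```python
import numpy as np


def bbox_from_binary_mask(binary_mask: np.ndarray) -> list:
    """Return an axis-aligned box as [x_min, y_min, width, height].

    The COCO-style ordering is an assumption; the source does not state
    the exact layout of the returned list.
    """
    rows = np.any(binary_mask, axis=1)  # True for rows containing the object
    cols = np.any(binary_mask, axis=0)  # True for columns containing the object
    if not rows.any():
        # Hypothetical choice for an empty mask: a zero-sized box.
        return [0, 0, 0, 0]
    y_min, y_max = np.where(rows)[0][[0, -1]]
    x_min, x_max = np.where(cols)[0][[0, -1]]
    return [int(x_min), int(y_min),
            int(x_max - x_min + 1), int(y_max - y_min + 1)]


mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 1  # object covering rows 2..4 and columns 3..6
bbox = bbox_from_binary_mask(mask)
```

Because only the row and column extents are used, the resulting box is always axis-aligned, which is consistent with the note that rotated bounding boxes are not supported.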

get_categories()[source]

Get the list of categories.

Returns:

A list of categories

Return type:

List[dict]
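As an illustration only, a mapping from mask values to categories could be turned into a COCO-style category list as sketched below. The category names, the assumption that categories are given as plain strings, and the exact keys of each dictionary are hypothetical; the source only states that a list of dictionaries is returned.

```python
# Hypothetical input: mask value -> category name.
categories_by_mask_value = {1: "cell", 2: "nucleus"}

# COCO category entries conventionally carry at least an id and a name.
coco_categories = [
    {"id": mask_value, "name": name}
    for mask_value, name in sorted(categories_by_mask_value.items())
]
```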

get_individual_instances(mask, mode, compression_factor, **kwargs)[source]

Separates disjoint objects inside the same array.

Objects are separated based on two rules:
  1. objects have different labels in the input mask (e.g. one is associated with 1, the other with 2);

  2. objects that have the same label are disjoint in space, with the structuring element as in scipy.ndimage.label.

Parameters:
  • mask (ndarray) – numpy array containing the semantic masks.

  • mode (str) – encoding mode for the semantic mask. Can be ‘RLE’, ‘cRLE’, or ‘polygon’.

  • compression_factor (Union[float, Tuple[float, float]]) – compression factor applied to the semantic map before conversion to COCO format. Use a factor > 1 to compress the segmentation mask so that its encoding does not occupy too much memory. The compression down-samples the mask array to a lower resolution before the subsequent conversion to COCO format.

  • **kwargs – optional keyword parameters for the encoding.

Returns:

Segmentations, bounding boxes, and categories contained in the input array.

Return type:

Tuple[List, List, List]
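The two separation rules and the down-sampling step can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: the helper split_instances is hypothetical, 0 is assumed to be the background value, and the compression is approximated by integer striding rather than whatever resampling the adapter actually uses. The only external call is scipy.ndimage.label with its default structuring element.

```python
import numpy as np
from scipy import ndimage


def split_instances(mask: np.ndarray, factor: int = 1) -> list:
    """Return (label_value, binary_instance_mask) pairs from a label mask."""
    # Stand-in for the compression step: keep every `factor`-th pixel.
    small = mask[::factor, ::factor]
    instances = []
    for value in np.unique(small):
        if value == 0:  # assumption: 0 marks the background
            continue
        # Rule 1: objects with different label values are separate.
        per_label = small == value
        # Rule 2: same-label objects that are disjoint in space are
        # separate; the default structuring element is used here.
        components, count = ndimage.label(per_label)
        for component_id in range(1, count + 1):
            instances.append((int(value), components == component_id))
    return instances


mask = np.zeros((6, 6), dtype=np.uint8)
mask[0:2, 0:2] = 1  # first object labelled 1
mask[4:6, 4:6] = 1  # second, spatially disjoint object labelled 1
mask[0:2, 4:6] = 2  # one object labelled 2
instances = split_instances(mask)
labels = [value for value, _ in instances]
```

Each returned binary mask could then be encoded (as RLE, cRLE, or polygon) and paired with a bounding box and category, which corresponds to the three lists described in the return value.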

get_sample(idx)[source]

Get the COCO representation for a specific sample and its index.

Parameters:

idx (int) – sample index.

Returns:

The image and image_annotations values in COCO format.

Return type:

Optional[Tuple[dict, List[dict]]]
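For orientation, a returned sample might look like the tuple sketched below. The field names follow the standard COCO image and annotation records; all values are invented for illustration, and the exact set of keys the adapter emits is an assumption not confirmed by the source.

```python
import json

# Hypothetical image record for an 8x6 image.
image = {"id": 0, "file_name": "example.png", "width": 8, "height": 6}

# Hypothetical annotation for one rectangular object, polygon mode.
annotations = [
    {
        "id": 0,
        "image_id": image["id"],    # links the annotation to its image
        "category_id": 1,
        "bbox": [3, 2, 4, 3],       # [x, y, width, height]
        "area": 12,
        "segmentation": [[3, 2, 7, 2, 7, 5, 3, 5]],  # flat x, y pairs
        "iscrowd": 0,
    }
]

sample = (image, annotations)
serialized = json.dumps({"image": image, "annotations": annotations})
```

Since the return type is Optional, callers should also be prepared to receive None instead of a tuple.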

read_image(idx)[source]

Reads an image from its positional index in the data paths.

Parameters:

idx (int) – image index.

Returns:

An image array corresponding to the given index in the data paths.

Return type:

ndarray

Attribute Details

_abc_impl = <_abc._abc_data object>