COCOValidator

class cocohelper.validator.COCOValidator[source]

Bases: object

This class validates COCO datasets.

Method List

_annotation_ids_are_unique(json_data)

Check if there are duplicate annotation ids.

_annotations_have_mandatory_keys(json_data[, ...])

Check if annotations have the mandatory keys.

_annotations_have_valid_category_id(json_data)

Check that annotations have a valid category id.

_annotations_have_valid_image_id(json_data)

Check that annotations have a valid image id.

_categories_have_mandatory_keys(json_data[, ...])

Check if categories have the mandatory keys.

_category_ids_are_unique(json_data)

Check if there are duplicated category ids.

_has_valid_dataset_tree(dataset_dir)

Check dataset directory tree validity.

_image_ids_are_unique(json_data)

Check if there are duplicated image ids.

_images_have_mandatory_keys(json_data[, ...])

Check if images have the mandatory keys.

_json_has_mandatory_keys(json_data[, ...])

Check if the input COCO annotation json file has the mandatory keys.

_licenses_ids_are_unique(json_data)

Check if there are duplicate license ids.

validate_dataset()

Check if this is a valid COCO dataset.

validate_dir([json_fname])

Check the COCO dataset validity based on the dir structure.

Method Details

static _annotation_ids_are_unique(json_data)[source]

Check if there are duplicate annotation ids.

Parameters:

json_data (Dict) – data from a coco json file.

Returns:

True if the annotation ids are unique, False otherwise.

Return type:

bool
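
To illustrate what a uniqueness check of this kind involves, here is a minimal standalone sketch. It does not call cocohelper: the function name `ids_are_unique` and its `section` parameter are hypothetical, and the library's actual implementation may differ.

```python
from typing import Any, Dict, List


def ids_are_unique(json_data: Dict[str, Any], section: str = "annotations") -> bool:
    """Return True if no two entries in json_data[section] share an 'id'.

    Hypothetical helper for illustration only; not part of cocohelper.
    """
    ids: List[Any] = [entry["id"] for entry in json_data.get(section, [])]
    # A set drops duplicates, so equal lengths mean every id is distinct.
    return len(ids) == len(set(ids))


data = {"annotations": [{"id": 1}, {"id": 2}, {"id": 2}]}
print(ids_are_unique(data))  # False: id 2 appears twice
```

The same comparison would apply to the "images", "categories" and "licenses" sections by passing a different `section` value.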

static _annotations_have_mandatory_keys(json_data, mandatory_keys=None)[source]

Check if annotations have the mandatory keys.

Parameters:
  • json_data (Dict) – data from a coco json file.

  • mandatory_keys (Optional[List]) – mandatory keys for a valid COCO json file.

Returns:

True if the dictionary has mandatory keys, False otherwise.

Return type:

bool

static _annotations_have_valid_category_id(json_data)[source]

Check that annotations have a valid category id.

Parameters:

json_data (Dict) – data from a coco json file.

Return type:

bool
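
The page does not spell out what makes a category id "valid". One plausible reading, assumed here, is that every annotation's `category_id` must match the `id` of an entry in "categories". The sketch below shows that referential check on its own; the function name is hypothetical and it is not the library's code.

```python
from typing import Any, Dict


def annotations_reference_known_categories(json_data: Dict[str, Any]) -> bool:
    """Return True if every annotation points at an existing category.

    Hypothetical helper; assumes "valid" means "present in categories".
    """
    known_ids = {cat["id"] for cat in json_data.get("categories", [])}
    # An annotation without a category_id yields None, which is never a known id.
    return all(
        ann.get("category_id") in known_ids
        for ann in json_data.get("annotations", [])
    )


data = {
    "categories": [{"id": 1, "name": "cat"}],
    "annotations": [{"id": 10, "category_id": 1}, {"id": 11, "category_id": 7}],
}
print(annotations_reference_known_categories(data))  # False: category 7 is unknown
```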

static _annotations_have_valid_image_id(json_data)[source]

Check that annotations have a valid image id.

Parameters:

json_data (Dict) – data from a coco json file.

Return type:

bool

static _categories_have_mandatory_keys(json_data, mandatory_keys=None, recommended_keys=None)[source]

Check if categories have the mandatory keys.

Parameters:
  • json_data (Dict) – data from a coco json file.

  • mandatory_keys (Optional[List]) – mandatory keys for a valid COCO json file. If missing, validation fails.

  • recommended_keys (Optional[List]) – recommended keys for a valid COCO json file. If missing, generates a warning. The check on these keys is run only after checking the mandatory keys.

Returns:

True if the dictionary has mandatory keys, False otherwise.

Return type:

bool
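
The split between mandatory and recommended keys can be sketched as follows. This is a standalone illustration under assumptions, not the library's implementation: the function name `entries_have_keys` is hypothetical, and the key names in the example ("id", "name", "supercategory") are only illustrative values, since this page does not list the defaults.

```python
import warnings
from typing import Any, Dict, List, Optional


def entries_have_keys(
    entries: List[Dict[str, Any]],
    mandatory_keys: List[str],
    recommended_keys: Optional[List[str]] = None,
) -> bool:
    """Fail on missing mandatory keys, warn on missing recommended keys.

    Hypothetical helper for illustration only; not part of cocohelper.
    """
    # Mandatory keys are checked first: any missing one fails validation.
    for entry in entries:
        if any(key not in entry for key in mandatory_keys):
            return False
    # Recommended keys are checked only after the mandatory check has passed.
    for entry in entries:
        for key in recommended_keys or []:
            if key not in entry:
                warnings.warn(
                    f"entry {entry.get('id')!r} is missing recommended key {key!r}"
                )
    return True


categories = [{"id": 1, "name": "cat", "supercategory": "animal"}]
print(entries_have_keys(categories, ["id", "name"], ["supercategory"]))  # True
```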

static _category_ids_are_unique(json_data)[source]

Check if there are duplicated category ids.

Parameters:

json_data (Dict) – data from a coco json file.

Return type:

bool

_has_valid_dataset_tree(dataset_dir)[source]

Check dataset directory tree validity.

Parameters:

dataset_dir (Union[str, Path]) – path to the dataset directory.

Returns:

True if the dataset tree is valid, False otherwise.

Return type:

bool

static _image_ids_are_unique(json_data)[source]

Check if there are duplicated image ids.

Parameters:

json_data (Dict) – data from a coco json file.

Return type:

bool

static _images_have_mandatory_keys(json_data, mandatory_keys=None)[source]

Check if images have the mandatory keys.

Parameters:
  • json_data (Dict) – data from a coco json file.

  • mandatory_keys (Optional[List]) – mandatory keys for a valid COCO json file.

Returns:

True if the dictionary has mandatory keys, False otherwise.

Return type:

bool

static _json_has_mandatory_keys(json_data, mandatory_keys=None)[source]

Check if the input COCO annotation json file has the mandatory keys.

Correct COCO annotation json files must have the following keys: “images”, “annotations”, “categories”. There are some exceptions: for example, the “categories” field does not exist for caption annotations. In these cases, you can explicitly feed the mandatory_keys to the function.

Parameters:
  • json_data (Dict) – data from a coco json file.

  • mandatory_keys (Optional[List]) – mandatory keys for a valid COCO json file.

Returns:

True if the json file has mandatory keys, False otherwise.

Return type:

bool
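
The default keys and the caption exception described above can be illustrated with a short standalone sketch. It does not call cocohelper; `json_has_keys` and `DEFAULT_KEYS` are hypothetical names, and only the three default keys are taken from the description.

```python
from typing import Any, Dict, List, Optional

# Default top-level keys, as listed in the description above.
DEFAULT_KEYS = ["images", "annotations", "categories"]


def json_has_keys(
    json_data: Dict[str, Any], mandatory_keys: Optional[List[str]] = None
) -> bool:
    """Return True if json_data contains every mandatory top-level key.

    Hypothetical helper for illustration only; not part of cocohelper.
    """
    keys = DEFAULT_KEYS if mandatory_keys is None else mandatory_keys
    return all(key in json_data for key in keys)


# A caption-style file has no "categories" section.
captions = {"images": [], "annotations": []}
print(json_has_keys(captions))                             # False
print(json_has_keys(captions, ["images", "annotations"]))  # True
```

Passing the reduced key list explicitly is what lets a caption file pass a check that would otherwise fail on the missing "categories" key.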

static _licenses_ids_are_unique(json_data)[source]

Check if there are duplicate license ids.

Parameters:

json_data (Dict) – data from a coco json file.

Return type:

bool

validate_dataset()[source]

Check if this is a valid COCO dataset.

Returns:

A tuple whose first element is True if this is a valid dataset; the second element is a dict.

Return type:

Tuple[bool, dict]

validate_dir(json_fname='coco.json')[source]

Check the COCO dataset validity based on the dir structure.

Parameters:

json_fname (str) – the name of the json file containing the COCO dataset.

Returns:

True if this is a valid COCO dataset.

Return type:

bool