COCOHelper
- class cocohelper.helper.COCOHelper[source]
Bases:
object
Represent a dataset in the COCO format.
To create an instance of COCOHelper, it is advisable to use one of the load methods.
- Parameters:
img_df – DataFrame of images.
ann_df – DataFrame of annotations.
cat_df – DataFrame of categories, optional.
lic_df – DataFrame of licenses, optional.
info – Info dict, optional.
coco_dir – Root directory of the dataset, optional.
paths – COCOHelperPaths, used to customize directory structure.
validate – If True, validate the COCO dataset and raise an error if invalid.
- Raises:
COCOValidationError – if the input COCO dataset is not valid. This check is performed only if validate=True.
Method List
_copy_images(target_img_dir)
_read_annotations_file(annotation_file) – Read a COCO json file as a dict.
Remove annotations that have non-existing image or category ids.
Validate the COCO dataset and raise an error if invalid.
copy([cat_df, img_df, ann_df, lic_df, info, ...]) – Copy the dataset and optionally change some dataframes.
Drop duplicate annotations (same values with different index).
Drop duplicate categories (same values with different index).
Drop duplicate images (same values with different index).
Drop duplicate licenses (same values with different index).
drop_labelled() – Get a new COCOHelper dataset that contains only unlabelled images.
drop_unlabelled() – Get a new COCOHelper dataset that does not contain unlabelled images.
filter(cfilter, *[, ann_ids, img_ids, ...]) – Get a copy of the dataset with the applied filters.
filter_anns([cfilter, ann_ids, img_ids, ...]) – Get a copy of the dataset with filtered annotations.
filter_cats([cfilter, cat_ids, cat_nms, ...]) – Get a copy of the dataset with filtered categories.
filter_imgs([cfilter, img_ids, img_nms, ...]) – Get a copy of the dataset with filtered images.
filtered_anns([cfilter, ann_ids, img_ids, ...]) – Get the dataset's annotations joined with categories and images, optionally filtered.
filtered_cats([cfilter, cat_ids, cat_nms, ...]) – Get the dataset's categories, optionally filtered by the provided filters.
filtered_imgs([cfilter, img_ids, img_nms, ...]) – Get the dataset's images joined with annotations and categories, optionally filtered.
get_ann_sample([ann_id, idx, transform]) – Load a single annotation with the corresponding image.
get_img(img_id) – Load the image with img_id as a numpy array.
get_img_sample([img_id, idx, transform]) – Load an image with its info and annotations.
load(coco_dir[, ann_fname, ann_dir, ...]) – Create a COCOHelper from a COCO dataset stored in a directory.
load_data(annotations, coco_dir[, ...])
load_json(json_annotations_file[, img_dir, ...]) – Create a COCOHelper from the json annotation file of a COCO dataset stored in a directory.
merge(*coco_helper[, drop_duplicates]) – Merge different COCO datasets with all categories, images, annotations and licenses merged.
Get a generic info dict for COCO format.
save(coco_dir[, fix_img_path, copy_images]) – Save the current COCOHelper to a directory.
to_coco() – Convert COCOHelper to pycocotools.COCO.
to_json_dataset() – Convert the current COCOHelper to a dict with the same structure as the COCO json file.
write_annotations_file(annotation_file_path) – Save the current COCOHelper as a COCO json file.
Attributes List
anns – Dataframe containing the annotations data of the COCO dataset.
cats – Dataframe containing the categories data of the COCO dataset.
imgs – Dataframe containing the images metadata of the COCO dataset.
info – Dataframe containing extra information of the COCO dataset.
joins – Get a COCOJoins object that enables easy access to different joins of the dataset tables.
labelled_imgs – Get only the labelled images as a DataFrame.
licenses – Dataframe containing the licenses of the COCO dataset.
paths – Information about folder and file organization for a COCO dataset.
root_path – Path to the root directory containing the COCO dataset.
unlabelled_imgs – Get only the unlabelled images as a DataFrame.
validator – Get a COCOValidator object that enables easy access to different validation methods.
Methods Details
- classmethod _read_annotations_file(annotation_file)[source]
Read a COCO json file as a dict.
- Parameters:
annotation_file (str) –
- Return type:
dict
- copy(cat_df=None, img_df=None, ann_df=None, lic_df=None, info=None, validate=False)[source]
Copy the dataset and optionally change some dataframes.
When categories or images are changed, annotations that become invalid as a result are removed.
- Parameters:
cat_df (Optional[DataFrame]) – New category dataframe, optional
img_df (Optional[DataFrame]) – New image dataframe, optional
ann_df (Optional[DataFrame]) – New annotation dataframe, optional
lic_df (Optional[DataFrame]) – New license dataframe, optional
info (Optional[dict]) – New info dict, optional
validate (bool) – If True, validate the COCO dataset and raise an error if invalid
- Returns:
A new COCOHelper object.
- Return type:
COCOHelper
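The clean-up rule described for copy (annotations that become invalid are removed when images or categories change) can be illustrated on a plain COCO-style dict. This is a minimal sketch in standard-library Python, not the library's implementation; the function name drop_orphan_annotations and the sample data are hypothetical.

```python
from typing import Any, Dict, List

CocoDict = Dict[str, List[Dict[str, Any]]]


def drop_orphan_annotations(dataset: CocoDict) -> CocoDict:
    """Return a copy of a COCO-style dict without annotations whose
    image_id or category_id no longer exists (hypothetical helper)."""
    img_ids = {img["id"] for img in dataset.get("images", [])}
    cat_ids = {cat["id"] for cat in dataset.get("categories", [])}
    kept = [
        ann
        for ann in dataset.get("annotations", [])
        if ann["image_id"] in img_ids and ann["category_id"] in cat_ids
    ]
    cleaned = dict(dataset)  # shallow copy: the input dict is left untouched
    cleaned["annotations"] = kept
    return cleaned


# Illustrative data: image 2 is missing, so annotation 11 is an orphan.
sample = {
    "images": [{"id": 1, "file_name": "a.jpg"}],
    "categories": [{"id": 7, "name": "dog"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 7},
        {"id": 11, "image_id": 2, "category_id": 7},
    ],
}
cleaned = drop_orphan_annotations(sample)
print([ann["id"] for ann in cleaned["annotations"]])
```

Only annotation 10 survives in the cleaned copy, while the input dict still holds both annotations.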
- drop_labelled()[source]
Get a new COCOHelper dataset that contains only unlabelled images.
- Returns:
A new COCOHelper object containing only unlabelled images.
- Return type:
COCOHelper
- drop_unlabelled()[source]
Get a new COCOHelper dataset that does not contain unlabelled images.
- Returns:
A new COCOHelper object containing only labelled images.
- Return type:
COCOHelper
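The distinction that drop_labelled and drop_unlabelled rely on can be sketched on plain lists of COCO records. The sketch assumes that an image counts as labelled when at least one annotation references its id; the documentation above does not spell this out, and split_by_label is a hypothetical name, not part of the library.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def split_by_label(
    images: List[Record], annotations: List[Record]
) -> Tuple[List[Record], List[Record]]:
    """Split images into (labelled, unlabelled).

    Assumption: an image is labelled if some annotation has its id
    as image_id.
    """
    labelled_ids = {ann["image_id"] for ann in annotations}
    labelled = [img for img in images if img["id"] in labelled_ids]
    unlabelled = [img for img in images if img["id"] not in labelled_ids]
    return labelled, unlabelled


# Illustrative records: only image 1 is referenced by an annotation.
images = [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}]
annotations = [{"id": 10, "image_id": 1, "category_id": 7}]
labelled, unlabelled = split_by_label(images, annotations)
print([img["id"] for img in labelled], [img["id"] for img in unlabelled])
```

Under that assumption, drop_unlabelled corresponds to keeping the first list and drop_labelled to keeping the second.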
- filter(cfilter, *, ann_ids=None, img_ids=None, img_nms=None, cat_ids=None, cat_nms=None, supercat_nms=None, area_rng=None, is_crowd=None, composition=<class 'cocohelper.filters.filter.AndFilter'>, invert=False, drop_orphans=True)[source]
Get a copy of the dataset with the applied filters.
- Parameters:
cfilter (Filter) – a custom Filter for the COCOHelper.
ann_ids (Optional[Union[Sequence[int], int]]) – a filter for the annotation ids.
img_ids (Optional[Union[Sequence[int], int]]) – a filter for the image ids.
img_nms (Optional[Union[Sequence[str], str]]) – a filter for the image file names.
cat_ids (Optional[Union[Sequence[int], int]]) – a filter for the category ids.
cat_nms (Optional[Union[Sequence[str], str]]) – a filter for the category names.
supercat_nms (Optional[Union[Sequence[str], str]]) – a filter for the super-category names.
area_rng (Optional[Tuple[float, float]]) – a filter for the annotation area.
is_crowd (Optional[bool]) – a filter for the annotation being a crowd or not (“is_crowd” in the annotation of the COCO JSON file).
composition (Type[ComposeFilter]) – a composition type for the filters (defaults to “and” behavior between each filter).
invert (bool) – if True, invert the way the filter works.
drop_orphans (bool) – if True, drop orphans when applying the filter.
- Returns:
A COCOHelper with data filtered according to the given filters.
- Return type:
COCOHelper
- filter_anns(cfilter=None, *, ann_ids=None, img_ids=None, img_nms=None, cat_ids=None, cat_nms=None, supercat_nms=None, area_rng=None, is_crowd=None, composition=<class 'cocohelper.filters.filter.AndFilter'>, invert=False)[source]
Get a copy of the dataset with filtered annotations.
- Parameters:
cfilter (Optional[Filter]) – a custom Filter for the COCOHelper.
ann_ids (Optional[Union[Sequence[int], int]]) – a filter for the annotation ids.
img_ids (Optional[Union[Sequence[int], int]]) – a filter for the image ids.
img_nms (Optional[Union[Sequence[str], str]]) – a filter for the image file names.
cat_ids (Optional[Union[Sequence[int], int]]) – a filter for the category ids.
cat_nms (Optional[Union[Sequence[str], str]]) – a filter for the category names.
supercat_nms (Optional[Union[Sequence[str], str]]) – a filter for the super-category names.
area_rng (Optional[Tuple[float, float]]) – a filter for the annotation area.
is_crowd (Optional[bool]) – a filter for the annotation being a crowd or not (“is_crowd” in the annotation of the COCO JSON file).
composition (Type[ComposeFilter]) – a composition type for the filters (defaults to “and” behavior between each filter).
invert (bool) – if True, invert the way the filter works.
- Returns:
A COCOHelper with data filtered according to the given filters.
- Return type:
COCOHelper
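A possible way to call filter_anns with the keyword-only filters listed above is sketched below. The helper argument is expected to be a COCOHelper instance; the wrapper function names, the category name and the area range are illustrative values, and the exact meaning of invert should be checked against the library before relying on it.

```python
from typing import Any


def select_large_dogs(helper: Any) -> Any:
    """Keep non-crowd 'dog' annotations within an area range.

    `helper` is expected to be a COCOHelper. The category name and the
    area range are illustrative, not taken from a real dataset.
    """
    return helper.filter_anns(
        cat_nms="dog",
        area_rng=(1000.0, 50000.0),
        is_crowd=False,
    )


def select_all_but_large_dogs(helper: Any) -> Any:
    """Same filters as above, with invert=True passed through."""
    return helper.filter_anns(
        cat_nms="dog",
        area_rng=(1000.0, 50000.0),
        is_crowd=False,
        invert=True,
    )
```

Because composition defaults to AndFilter, the three filters in the first call are combined with "and" behavior. Both functions return what filter_anns returns, which is a new COCOHelper.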
- filter_cats(cfilter=None, *, cat_ids=None, cat_nms=None, supercat_nms=None, composition=<class 'cocohelper.filters.filter.AndFilter'>, invert=False)[source]
Get a copy of the dataset with filtered categories.
- Parameters:
cfilter (Optional[Filter]) – a custom Filter for the COCOHelper.
cat_ids (Optional[Union[Sequence[int], int]]) – a filter for the category ids.
cat_nms (Optional[Union[Sequence[str], str]]) – a filter for the category names.
supercat_nms (Optional[Union[Sequence[str], str]]) – a filter for the super-category names.
composition (Type[ComposeFilter]) – a composition type for the filters (defaults to “and” behavior between each filter).
invert (bool) – if True, invert the way the filter works.
- Returns:
A COCOHelper with data filtered according to the given filters.
- Return type:
COCOHelper
- filter_imgs(cfilter=None, *, img_ids=None, img_nms=None, cat_ids=None, cat_nms=None, supercat_nms=None, composition=<class 'cocohelper.filters.filter.AndFilter'>, invert=False)[source]
Get a copy of the dataset with filtered images.
- Parameters:
cfilter (Optional[Filter]) – a custom Filter for the COCOHelper.
img_ids (Optional[Union[Sequence[int], int]]) – a filter for the image ids.
img_nms (Optional[Union[Sequence[str], str]]) – a filter for the image file names.
cat_ids (Optional[Union[Sequence[int], int]]) – a filter for the category ids.
cat_nms (Optional[Union[Sequence[str], str]]) – a filter for the category names.
supercat_nms (Optional[Union[Sequence[str], str]]) – a filter for the super-category names.
composition (Type[ComposeFilter]) – a composition type for the filters (defaults to “and” behavior between each filter).
invert (bool) – if True, invert the way the filter works.
- Returns:
A COCOHelper with data filtered according to the given filters.
- Return type:
COCOHelper
- filtered_anns(cfilter=None, *, ann_ids=None, img_ids=None, img_nms=None, cat_ids=None, cat_nms=None, supercat_nms=None, area_rng=None, is_crowd=None, composition=<class 'cocohelper.filters.filter.AndFilter'>, invert=False)[source]
Get the dataset’s annotations joined with categories and images, optionally filtered.
- Parameters:
cfilter (Optional[Filter]) – a custom Filter for the COCOHelper.
ann_ids (Optional[Union[Sequence[int], int]]) – a filter for the annotation ids.
img_ids (Optional[Union[Sequence[int], int]]) – a filter for the image ids.
img_nms (Optional[Union[Sequence[str], str]]) – a filter for the image file names.
cat_ids (Optional[Union[Sequence[int], int]]) – a filter for the category ids.
cat_nms (Optional[Union[Sequence[str], str]]) – a filter for the category names.
supercat_nms (Optional[Union[Sequence[str], str]]) – a filter for the super-category names.
area_rng (Optional[Tuple[float, float]]) – a filter for the annotation area.
is_crowd (Optional[bool]) – a filter for the annotation being a crowd or not (“is_crowd” in the annotation of the COCO JSON file).
composition (Type[ComposeFilter]) – a composition type for the filters (defaults to “and” behavior between each filter).
invert (bool) – if True, invert the way the filter works.
- Returns:
A pandas.DataFrame containing the filtered annotations.
- Return type:
DataFrame
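Unlike filter_anns, which returns a new COCOHelper, filtered_anns returns a DataFrame, so its result can be inspected directly. The sketch below counts annotations per category name by taking the length of each returned table. The function name count_annotations is hypothetical, and helper is expected to be a COCOHelper.

```python
from typing import Any, Dict, Iterable


def count_annotations(helper: Any, cat_names: Iterable[str]) -> Dict[str, int]:
    """Count the annotations of each category name.

    `helper` is expected to be a COCOHelper; len() of the DataFrame
    returned by filtered_anns gives its number of rows.
    """
    return {
        name: len(helper.filtered_anns(cat_nms=name))
        for name in cat_names
    }
```

A call such as count_annotations(helper, ["dog", "cat"]) would return a dict mapping each name to its number of annotation rows.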
- filtered_cats(cfilter=None, *, cat_ids=None, cat_nms=None, supercat_nms=None, composition=<class 'cocohelper.filters.filter.AndFilter'>, invert=False)[source]
Get dataset’s categories, potentially filtered by the provided filters.
- Parameters:
cfilter (Optional[Filter]) – a custom Filter for the COCOHelper.
cat_ids (Optional[Union[Sequence[int], int]]) – a filter for the category ids.
cat_nms (Optional[Union[Sequence[str], str]]) – a filter for the category names.
supercat_nms (Optional[Union[Sequence[str], str]]) – a filter for the super-category names.
composition (Type[ComposeFilter]) – a composition type for the filters (defaults to “and” behavior between each filter).
invert (bool) – if True, invert the way the filter works.
- Returns:
A pandas.DataFrame containing the filtered categories.
- Return type:
DataFrame
- filtered_imgs(cfilter=None, *, img_ids=None, img_nms=None, cat_ids=None, cat_nms=None, supercat_nms=None, composition=<class 'cocohelper.filters.filter.AndFilter'>, invert=False)[source]
Get the dataset’s images joined with annotations and categories, optionally filtered.
- Parameters:
cfilter (Optional[Filter]) – a custom Filter for the COCOHelper.
img_ids (Optional[Union[Sequence[int], int]]) – a filter for the image ids.
img_nms (Optional[Union[Sequence[str], str]]) – a filter for the image file names.
cat_ids (Optional[Union[Sequence[int], int]]) – a filter for the category ids.
cat_nms (Optional[Union[Sequence[str], str]]) – a filter for the category names.
supercat_nms (Optional[Union[Sequence[str], str]]) – a filter for the super-category names.
composition (Type[ComposeFilter]) – a composition type for the filters (defaults to “and” behavior between each filter).
invert (bool) – if True, invert the way the filter works.
- Returns:
A pandas.DataFrame containing the filtered images.
- Return type:
DataFrame
- get_ann_sample(ann_id=None, idx=None, transform=None)[source]
Load a single annotation with the corresponding image.
- Parameters:
ann_id (Optional[int]) – The id of the annotation to load. May be omitted only if idx is provided.
idx (Optional[int]) – The index of the annotation to load. May be omitted only if ann_id is provided.
transform (Optional[Transform]) – An optional Transform to modify the image and annotation.
- Returns:
The image as a numpy array and the annotation infos as a dict.
- Return type:
Tuple[np.ndarray, dict]
- get_img(img_id)[source]
Load the image with img_id as a numpy array.
- Parameters:
img_id (int) – The id of the image to load.
- Returns:
A numpy array with shape (H, W, C).
- Return type:
ndarray
- get_img_sample(img_id=None, idx=None, transform=None)[source]
Load an image with its info and annotations.
- Parameters:
img_id (Optional[int]) – The id of the image to load. May be omitted only if idx is provided.
idx (Optional[int]) – The index of the image to load. May be omitted only if img_id is provided.
transform (Optional[Transform]) – An optional Transform to modify the image and annotations.
- Returns:
A dictionary with image infos and data, and a list of annotations.
- Return type:
Tuple[dict, list]
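Index-based access with get_img_sample could be used to walk over the first few images of a dataset, as sketched below. The sketch assumes that idx is a zero-based position into the imgs table, which the documentation above does not state explicitly. The function name iter_img_samples is hypothetical, and helper is expected to be a COCOHelper.

```python
from typing import Any, Iterator, Tuple


def iter_img_samples(helper: Any, limit: int = 5) -> Iterator[Tuple[dict, list]]:
    """Yield (image_dict, annotations) for the first `limit` images.

    Assumption: `idx` runs from 0 to len(helper.imgs) - 1.
    """
    n = min(limit, len(helper.imgs))
    for idx in range(n):
        img_info, anns = helper.get_img_sample(idx=idx)
        yield img_info, anns
```

Each yielded pair follows the documented return type Tuple[dict, list]: a dictionary with the image info and data, and the list of its annotations.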
- classmethod load(coco_dir, ann_fname='coco.json', ann_dir='annotations/', img_dir='images/', validate=False)[source]
Create a COCOHelper from a COCO dataset stored in a directory.
- Parameters:
coco_dir (str) – path to the directory containing the dataset.
ann_fname (str) – name of the annotation file to load.
ann_dir (str) – name/relative-path to the directory where annotations are stored.
img_dir (str) – name/relative-path to the directory where images are stored.
validate (bool) – If True, validate the dataset.
- Returns:
A COCOHelper object.
- Return type:
COCOHelper
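The default arguments of load imply a directory layout with the annotation file under annotations/ and the images under images/. The sketch below spells out that layout and wraps the call. It assumes that ann_dir and img_dir are resolved relative to coco_dir and that the cocohelper package is installed; default_paths and load_dataset are hypothetical names.

```python
from pathlib import Path
from typing import Any, Tuple, Union


def default_paths(coco_dir: Union[str, Path]) -> Tuple[Path, Path]:
    """Return (annotation file, image directory) for load's defaults.

    Assumption: 'annotations/' and 'images/' are relative to coco_dir.
    """
    root = Path(coco_dir)
    return root / "annotations" / "coco.json", root / "images"


def load_dataset(coco_dir: Union[str, Path]) -> Any:
    """Load a dataset with the documented defaults and validation on."""
    # Deferred import: only needed when the function is actually called.
    from cocohelper.helper import COCOHelper

    return COCOHelper.load(
        str(coco_dir),
        ann_fname="coco.json",
        ann_dir="annotations/",
        img_dir="images/",
        validate=True,
    )
```

With validate=True, an invalid dataset is expected to raise COCOValidationError, as described for the class constructor.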
- classmethod load_data(annotations, coco_dir, ann_fname='coco.json', ann_dir='annotations/', img_dir='images/', validate=False)[source]
- Parameters:
annotations (Dict[str, DataFrame]) –
coco_dir (str) –
ann_fname (str) –
ann_dir (str) –
img_dir (str) –
validate (bool) –
- Return type:
- classmethod load_json(json_annotations_file, img_dir='images/', validate=False)[source]
Create COCOHelper from json annotation file of the COCO dataset stored in a directory.
- Parameters:
json_annotations_file (str) – path to the json file containing the dataset annotations.
img_dir (str) – name/relative-path to the directory where images are stored, relative to the COCO dataset root.
validate (bool) – If True, validate the dataset.
- Returns:
A COCOHelper object.
- Return type:
COCOHelper
- merge(*coco_helper, drop_duplicates=True)[source]
Merge different COCO datasets with all categories, images, annotations and licenses merged.
- Parameters:
*coco_helper (COCOHelper) – coco dataset(s) to merge with this coco dataset.
drop_duplicates (bool) – if True, drop redundant duplicate rows when merging.
- Returns:
A COCOHelper resulting from merging multiple datasets.
- Return type:
COCOHelper
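Since merge is called on one dataset and takes the others as positional arguments, merging a whole list of datasets needs a small amount of unpacking. A minimal sketch follows; the function name merge_all is hypothetical, and each item is expected to be a COCOHelper.

```python
from typing import Any, Sequence


def merge_all(helpers: Sequence[Any], drop_duplicates: bool = True) -> Any:
    """Merge a non-empty sequence of COCOHelper objects into one."""
    if not helpers:
        raise ValueError("at least one dataset is required")
    first, *rest = helpers
    if not rest:
        # A single dataset needs no merging.
        return first
    return first.merge(*rest, drop_duplicates=drop_duplicates)
```

The single-dataset case is returned unchanged because the behavior of merge with no arguments is not described above.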
- save(coco_dir, fix_img_path=False, copy_images=False)[source]
Save the current COCOHelper to a directory.
- Parameters:
coco_dir (Union[str, Path]) – Output root directory.
fix_img_path (bool) – Not implemented.
copy_images (bool) – Not implemented.
- Returns:
None.
- Return type:
None
- to_json_dataset()[source]
Convert the current COCOHelper to a dict with the same structure as the COCO json file.
- Return type:
dict
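Because the returned dict mirrors the COCO json file, it can be inspected with ordinary dict operations. The sketch below counts the entries of the standard COCO top-level lists. The function name summarize and the sample dict are illustrative; in practice the input would be the result of to_json_dataset().

```python
from typing import Any, Dict

# Top-level list sections of the COCO json format.
COCO_LIST_KEYS = ("images", "annotations", "categories", "licenses")


def summarize(coco_dict: Dict[str, Any]) -> Dict[str, int]:
    """Count the entries of each top-level COCO list (0 if missing)."""
    return {key: len(coco_dict.get(key, [])) for key in COCO_LIST_KEYS}


# Illustrative dict standing in for the output of to_json_dataset().
example = {
    "info": {},
    "images": [{"id": 1, "file_name": "a.jpg"}],
    "annotations": [{"id": 10, "image_id": 1, "category_id": 7}],
    "categories": [{"id": 7, "name": "dog"}],
}
print(summarize(example))
```

The example dict has no licenses key, so that count is reported as 0.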
- write_annotations_file(annotation_file_path)[source]
Save the current COCOHelper as a COCO json file.
- Parameters:
annotation_file_path (Union[str, Path]) –
Attribute Details
- anns
Dataframe containing the annotations data of the COCO dataset.
- cats
Dataframe containing the categories data of the COCO dataset.
- imgs
Dataframe containing the images metadata of the COCO dataset.
- info
Dataframe containing extra information of the COCO dataset.
- joins
Get a COCOJoins object that enables easy access to different joins of the dataset tables.
- labelled_imgs
Get only the labelled images as a DataFrame.
- Returns:
A pandas.DataFrame containing the labelled images.
- licenses
Dataframe containing the licenses of the COCO dataset.
- paths
Information about folder and file organization for a COCO dataset.
- root_path
Path to the root directory containing the COCO dataset.
- unlabelled_imgs
Get only the unlabelled images as a DataFrame.
- Returns:
A pandas.DataFrame containing the unlabelled images.
- validator
Get a COCOValidator object that enables easy access to different validation methods.