fix_fk_after_drop_duplicate
- cocohelper.utils.dataframe.fix_fk_after_drop_duplicate(connected_df, fk_column, merge_index_mapping)
Fix the foreign key of a dataframe connected to a dataframe with dropped duplicates.
The foreign keys of connected_df that were pointing to indices that have been merged together should now point to the single instance of each group of duplicates that was kept by the drop-duplicates step.
- Parameters:
connected_df (DataFrame) – dataframe connected to a dataframe for which duplicates have been removed.
fk_column (str) – the column of connected_df that contains the foreign key that should be fixed.
merge_index_mapping (Dict) – a dict that maps the dropped keys to the key of the not-dropped duplicate row, e.g. if we merged rows with index (0, 1, 2) keeping only 0, and we merged rows with index (3, 4, 5) keeping only 3, this map should be: {1: 0, 2: 0, 4: 3, 5: 3}.
- Returns:
A copy of connected_df with the foreign key fixed (the values of fk_column).
- Return type:
DataFrame
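- Example:
The following is a minimal sketch of the behavior described above, not the library's actual implementation. It assumes plain pandas; the stand-in function remap_fk_sketch and the annotations / image_id names are hypothetical and chosen only for illustration. The mapping is the one from the merge_index_mapping description.

```python
import pandas as pd


def remap_fk_sketch(connected_df, fk_column, merge_index_mapping):
    """Hypothetical stand-in that mimics the documented behavior."""
    # Work on a copy so the caller's dataframe is left untouched.
    fixed = connected_df.copy()
    # Dropped keys are redirected to the kept key; all other keys stay as they are.
    fixed[fk_column] = fixed[fk_column].map(
        lambda key: merge_index_mapping.get(key, key)
    )
    return fixed


# Rows (0, 1, 2) were merged keeping 0, and rows (3, 4, 5) were merged keeping 3.
merge_index_mapping = {1: 0, 2: 0, 4: 3, 5: 3}

# Hypothetical connected dataframe whose image_id column is the foreign key.
annotations = pd.DataFrame({"id": [10, 11, 12, 13], "image_id": [0, 1, 4, 5]})

fixed = remap_fk_sketch(annotations, "image_id", merge_index_mapping)
print(fixed["image_id"].tolist())  # [0, 0, 3, 3]
```

Keys that do not appear in the mapping (here 0) are left unchanged, and the input dataframe is not modified, which matches the "copy" wording in the Returns entry.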