Robust and efficient post-processing for Video Object Detection (REPP)
REPP is a learning-based post-processing method that improves the video object detections produced by any object detector. REPP links detections across frames by evaluating their similarity, then refines their classification and location to suppress false positives and recover missed detections.
REPP improves video detections for both image and video object detectors, and it adds only a light computational overhead.
Installation
REPP has been tested with Python 3.6.
Its dependencies are listed in the repp_requirements.txt file.
pip install -r repp_requirements.txt
Quick usage guide
Video detections must be stored with pickle as tuples (video_name, {frame_dets}), as follows:
("video_name", {"000001": [ det_1, det_2, ..., det_N ],
                "000002": [ det_1, det_2, ..., det_M ],
                ...})
If the stored predictions file contains detections for several videos, they must be saved as a stream of tuples in the above format, one tuple per video.
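One way to produce such a stream is to call pickle.dump once per video on the same open file, and to read it back by calling pickle.load until the end of the file. The following is a minimal sketch under that assumption: write_video_stream and read_video_stream are hypothetical helper names that are not part of REPP, and the detection lists are left empty for brevity.

```python
import pickle


def write_video_stream(path, videos):
    """Write one (video_name, frame_dets) tuple per video into a single pickle file."""
    with open(path, "wb") as f:
        for video_name, frame_dets in videos:
            # Each dump appends an independent pickled tuple to the file.
            pickle.dump((video_name, frame_dets), f)


def read_video_stream(path):
    """Yield the stored (video_name, frame_dets) tuples in the order they were written."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # No more pickled tuples in the file.
                return
```

Reading the file lazily, one tuple at a time, keeps memory use bounded by the largest single video rather than by the whole predictions file.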
Each detection must have the following format:
det_1: {'image_id': image_id,                     # Same as the one used in ILSVRC, if applicable
        'bbox': [ x_min, y_min, width, height ],
        'scores': scores,                         # Vector of class confidence scores
        'bbox_center': (x,y) }                    # Relative bounding box center
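As an illustration, a detection in this format could be built from a pixel-space box as shown here. This is a sketch under stated assumptions: make_detection is a hypothetical helper that is not part of REPP, and "relative" is taken to mean the box center divided by the image width and height. Check that interpretation against the detections your own pipeline produces.

```python
def make_detection(image_id, x_min, y_min, width, height, scores,
                   img_width, img_height):
    """Build one detection dict in the format described above (hypothetical helper)."""
    # Assumption: the relative center is the pixel center normalized by image size.
    center_x = (x_min + width / 2.0) / img_width
    center_y = (y_min + height / 2.0) / img_height
    return {
        "image_id": image_id,
        "bbox": [x_min, y_min, width, height],
        "scores": scores,
        "bbox_center": (center_x, center_y),
    }


# Example: a 200x100 box at (100, 50) in an 800x400 image, with two class scores.
det = make_detection("000001", 100, 50, 200, 100, [0.1, 0.9], 800, 400)
```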