Camera-LiDAR Fusion for 3D Object Detection


Compared to single-modality approaches, fusion-based object detection methods often require more complex models to integrate heterogeneous sensor data, along with more GPU memory and computational resources. This is particularly true for camera-LiDAR multimodal fusion, which may require three separate deep-learning networks or processing pipelines: one for the visual data, one for the LiDAR data, and one for the fusion framework. The Fast Camera-LiDAR Object Candidates (Fast-CLOCs) fusion network has been developed to run high-accuracy fusion-based 3D object detection in near real time.


Fast-CLOCs operates on the output candidates of any 3D detector before Non-Maximum Suppression (NMS) and adds a lightweight 3D-detector-cued 2D image detector (3D-Q-2D) that extracts visual features from the image domain to significantly improve 3D detections. The 3D detection candidates are shared with the proposed 3D-Q-2D image detector as proposals, drastically reducing network complexity. Results on the challenging KITTI and nuScenes benchmarks show that Fast-CLOCs outperforms state-of-the-art fusion-based 3D object detection approaches.
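The core idea of sharing 3D candidates as 2D proposals can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the licensed implementation: it assumes 3D box corners are already expressed in camera coordinates and uses a standard pinhole projection with a hypothetical intrinsics matrix `K` to turn each 3D candidate into an axis-aligned 2D image proposal (the bounding rectangle of its projected corners).

```python
import numpy as np

def project_3d_boxes_to_2d_proposals(corners_3d, K):
    """Project 3D box corners to 2D image proposals.

    corners_3d: (N, 8, 3) box corners in camera coordinates (z forward).
    K: (3, 3) camera intrinsics matrix.
    Returns (N, 4) proposals as [x1, y1, x2, y2] in pixels.
    """
    n = corners_3d.shape[0]
    pts = corners_3d.reshape(-1, 3).T          # (3, N*8)
    uvw = K @ pts                              # pinhole projection
    uv = (uvw[:2] / uvw[2]).T.reshape(n, 8, 2) # divide by depth
    x1y1 = uv.min(axis=1)                      # top-left of bounding rect
    x2y2 = uv.max(axis=1)                      # bottom-right of bounding rect
    return np.concatenate([x1y1, x2y2], axis=1)
```

In a fusion pipeline of this kind, these projected rectangles would serve as region proposals for the 2D image detector, so the image branch only scores candidates that the 3D detector already produced rather than searching the full image.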


Benefits

  • Real-time operation
  • Improved 3D object detection
  • Reduced network complexity
  • Outperforms state-of-the-art fusion-based approaches


Applications

  • 3D object detection and tracking
  • Camera-LiDAR fusion
  • Autonomous vehicles
  • Navigation

IP Status

US Patent


Full licensing rights available

Developer: Hayder Radha, Su Pang

Tech ID: TEC2022-0099


For more information about this technology, contact Raymond DeVito, Ph.D., CLP at +1-517-884-1658.


Patent Information:

For Information, Contact:

Raymond DeVito
Technology Manager
Michigan State University