📢 INFO: This repository will be fully published upon the official publication announcement.
StixelNExT++ introduces a novel approach to monocular scene representation, extending the Stixel World paradigm. Our method infers 3D Stixels and enhances object segmentation by clustering smaller Stixel units, creating a highly compressed yet flexible representation for point clouds and bird’s-eye-view (BEV) maps.
✅ Real-time performance – Achieves inference speeds as fast as 10 ms per frame.
✅ Lightweight neural network – Trained with automatically generated LiDAR-based ground truth, both holistic and specific.
✅ Strong performance – Evaluated on the Waymo dataset, delivering competitive results within a **30-meter range**.
✅ Versatile applications – Ideal for collective perception in autonomous systems.
Our model was evaluated on the Waymo dataset, focusing on vehicles and pedestrians. Image features were extracted and projected into metric space using camera intrinsics. Stixels are represented by their start and end points in Cartesian space.
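For illustration, the minimal sketch below shows one plausible way to represent a Stixel by its Cartesian start and end points and to back-project image coordinates into metric space with the camera intrinsics. The `Stixel` class, the `back_project` helper, and the intrinsics values are hypothetical stand-ins assuming a simple pinhole camera model; they are not the released API.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Stixel:
    # Start and end points in Cartesian (metric) space, each (x, y, z).
    start: np.ndarray
    end: np.ndarray


def back_project(u: float, v: float, depth: float, K: np.ndarray) -> np.ndarray:
    """Back-project a pixel (u, v) at a metric depth into 3D camera
    coordinates via a pinhole model: X = depth * K^-1 @ [u, v, 1]."""
    pixel_h = np.array([u, v, 1.0])
    return depth * (np.linalg.inv(K) @ pixel_h)


# Illustrative intrinsics (focal lengths and principal point are made up).
K = np.array([[1000.0,    0.0, 960.0],
              [   0.0, 1000.0, 640.0],
              [   0.0,    0.0,   1.0]])

# A stixel spanning an image column segment at an estimated depth of 12.5 m.
stixel = Stixel(
    start=back_project(u=512, v=300, depth=12.5, K=K),
    end=back_project(u=512, v=450, depth=12.5, K=K),
)
print(stixel.start, stixel.end)
```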
By clustering Stixels, we achieve competitive object detection performance.
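As a rough sketch of such a clustering step, the example below groups stixels into object candidates by clustering their ground-contact points on the bird's-eye-view plane. DBSCAN is used here purely as a stand-in; the clustering method actually used by StixelNExT++ may differ, and `cluster_stixels` is a hypothetical helper.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_stixels(base_points: np.ndarray,
                    eps: float = 0.8,
                    min_samples: int = 3) -> np.ndarray:
    """Group stixels into object candidates by clustering their ground-contact
    points in the BEV plane (x lateral, z forward).

    base_points: (N, 3) array of stixel end points in camera coordinates.
    Returns one integer label per stixel; -1 marks noise.
    """
    bev = base_points[:, [0, 2]]  # drop height, keep lateral/forward coordinates
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(bev)


# Example: three nearby stixels form one cluster; the distant one is noise.
pts = np.array([[ 2.0, 1.5, 10.0],
                [ 2.3, 1.5, 10.2],
                [ 2.6, 1.5, 10.1],
                [15.0, 1.5, 25.0]])
print(cluster_stixels(pts, min_samples=2))  # e.g. [0, 0, 0, -1]
```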
We also trained models on the KITTI dataset, expanding our approach to capture all possible obstacles, including buildings, vegetation, and environmental structures.
More updates coming soon! 🚀