In robotics and autonomous systems, sensor fusion is a crucial technique that combines data from multiple sensors to enhance perception and decision-making. Each sensor has its strengths and limitations—for instance, cameras provide rich color and texture information but struggle in low-light conditions, while LiDAR offers precise depth measurements but lacks the ability to detect detailed visual features. By fusing data from both sensors, we can create a more robust and reliable perception system.
In this blog, we focus on two key applications of sensor fusion:
Marker Recognition – Identifying road markers that serve as key reference points for autonomous navigation.
Lane Generation – Using detected markers to construct reliable lane boundaries, enabling autonomous robots and vehicles to navigate efficiently.
A major challenge in this work is the detection of custom-made markers composed of glass beads. These markers are small in size, making them difficult to detect using conventional vision or LiDAR-based methods.
Relying on a single sensor for these tasks presents several challenges. A camera alone can struggle with varying lighting conditions, shadows, and occlusions, making marker detection unreliable. On the other hand, LiDAR alone provides accurate distance measurements but may fail to detect painted or printed markers on the road due to its limited resolution and inability to capture color information.
To overcome these limitations, this blog explores the fusion of camera and LiDAR data for marker recognition and lane generation. Specifically, we will test two setups:
Action 4 Camera and AutoL LiDAR
HiK Camera and Hesai LiDAR
Both setups involve placing markers on the left and right sides of the road and integrating multi-sensor data to improve detection accuracy. By fusing depth and visual information, we aim to overcome the challenges posed by small-sized markers, generate precise lane boundaries, and enhance the overall perception system for autonomous navigation.
In the following sections, we will dive into the details of each sensor setup, the fusion methodology, and the results obtained from our experiments.
For the first setup, we use an Action 4 camera and two solid-state AutoL LiDARs to achieve robust marker recognition and lane generation. Each sensor is strategically positioned to maximize coverage and improve detection accuracy.
2.1.1. Camera Setup
The Action 4 camera is securely mounted inside the vehicle, positioned centrally and facing forward through the windshield. This placement ensures that the camera captures a clear, unobstructed view of the road ahead, including the small, custom-made glass bead markers placed along the lane boundaries. Additionally, mounting the camera inside the vehicle protects it from external factors such as dust, rain, and sunlight glare, ensuring consistent performance in varying environmental conditions.
2.1.2. LiDAR Placement
Two solid-state AutoL LiDARs are installed at fixed positions near the front headlights, one on each side of the vehicle. This placement provides optimal coverage of the left and right sides of the road, ensuring that small, custom-made glass bead markers can be detected even in challenging conditions.
2.1.3. Why This Placement?
Action 4 Camera (Forward-Facing, Inside the Vehicle):
Captures visual information of the road, including marker color, shape, and texture.
Protected from weather conditions, reducing interference and sensor degradation.
AutoL LiDARs (Near Front Headlights, Left & Right Fixed Positions):
Provide precise depth information to detect markers and lane boundaries.
Ensure complete left and right coverage, reducing blind spots.
Improve marker detection accuracy, especially in low-light or high-glare conditions where cameras struggle.
This strategic sensor placement enables effective fusion of camera and LiDAR data, improving the overall accuracy and reliability of marker recognition and lane generation.
For the second setup, we use the HiK MV-CA023-10GC camera and the Hesai OT128 LiDAR, chosen for their strong performance and the complementary nature of the data they collect.
HiK Camera
The HiK MV-CA023-10GC camera is mounted on top of the vehicle, strategically positioned to capture a wide, unobstructed view of the road ahead, including lane boundaries and markers. This elevated placement allows for an extensive field of view, optimizing the detection of small, custom-made glass bead markers and lane features from a broader angle. By positioning the camera on top, we reduce the impact of occlusions or other road obstructions that could limit visibility if the camera were placed lower on the vehicle. Additionally, the camera is enclosed in a protective case that shields it from environmental elements such as dust, rain, and sunlight glare, helping maintain reliable, consistent performance across weather conditions.
Hesai LiDAR
The Hesai OT128 LiDAR, a high-performance 360° LiDAR, is also mounted on top of the vehicle, directly aligned with the camera. This setup ensures that both the camera and LiDAR share a similar field of view, providing consistent and synchronized data for fusion. The OT128 LiDAR offers precise depth measurements, critical for detecting the small markers and accurately mapping the road lane boundaries. Its 360° coverage ensures that the vehicle has continuous environmental awareness, even if certain parts of the road are temporarily occluded by the vehicle itself.
HiK Camera (Top of the Vehicle)
Wide Field of View: Captures an expansive view of the road for accurate marker and lane detection.
Minimized Occlusions: Reduces interference from road signs, vehicles, or barriers for improved visibility.
Environmental Protection: Shielded from weather and sunlight, ensuring reliable data capture in varying conditions.
Hesai OT128 LiDAR (Top of the Vehicle)
360° Coverage: Provides continuous environmental awareness for lane boundary detection.
Synchronized Field of View: Ensures alignment with the camera for accurate data fusion.
Reduced Interference: Avoids obstruction by the vehicle body, ensuring clearer depth perception in low-visibility conditions.
This strategic positioning ensures that both sensors capture synchronized, unobstructed data, improving the system's overall performance.
To address the challenges in marker recognition and lane generation, we chose to implement early fusion.
Why We Chose Early Fusion
The early fusion approach was selected because it integrates raw sensor data at the very start of the pipeline, which optimizes both memory usage and real-time processing. This allows efficient use of computational resources without compromising marker recognition or lane generation performance.
It proved more effective in our scenario for the following key reasons.
Reduced Latency for Real-Time Decisions
Early fusion integrates raw sensor data immediately, minimizing delays and enabling faster decision-making, which is crucial for real-time tasks like autonomous navigation.
Optimized Memory Usage
By directly combining raw data, early fusion avoids storing intermediate results, reducing memory consumption and improving overall system efficiency—especially in memory-constrained environments.
Enhanced Sensor Synergy
Early fusion allows immediate integration of complementary sensor data (e.g., camera and LiDAR), improving accuracy and reliability, especially in challenging conditions like small marker detection.
Better Adaptability in Dynamic Environments
In rapidly changing environments, early fusion enhances system adaptability by quickly integrating sensor data, enabling faster response to environmental changes.
Improved Sensor Data Alignment
Early fusion ensures sensor data is aligned from the start, avoiding misalignment issues and maintaining high accuracy, particularly for tasks like detecting small markers where precision is critical.
In contrast, late fusion processes the sensor data independently before combining the results. While late fusion can be effective in some scenarios, it proved less suited to our task for the following reasons:
Increased Latency and Delayed Decisions
Late fusion processes sensor data independently first, which introduces delays and increases latency, making it less suitable for real-time decision-making in autonomous systems.
Higher Memory Consumption
Since late fusion stores intermediate outputs from each sensor before fusion, it requires more memory, which can become a limitation in systems with constrained resources.
Missed Opportunities for Sensor Complementation
Late fusion processes data from sensors separately, which reduces the system's ability to leverage the complementary strengths of both sensors, leading to less accurate marker detection and missed synergies.
Slower Adaptation in Dynamic Environments
The delay caused by processing sensor data independently before fusion makes late fusion less effective in dynamic environments where quick adaptation to changing conditions is essential.
Potential Data Misalignment
Since late fusion combines processed data after independent sensor processing, it increases the likelihood of misalignment, which can result in inaccuracies, particularly when precise alignment is critical for tasks like marker detection.
In short, adopting late fusion would have introduced additional delays and higher memory usage, making it less efficient for detecting small markers. Early fusion was therefore chosen as the better approach for our scenario.
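To make the distinction concrete, here is a minimal sketch in Python, with illustrative thresholds and array shapes that are not part of our actual pipeline. The point is only where the data is combined, not the detection logic itself: early fusion hands a single fused array to one detector, while late fusion keeps separate intermediate results per sensor and merges them only at the end.

```python
import numpy as np

def early_fusion_pipeline(image, depth_map):
    """Fuse raw data first, then run a single detector on the fused array."""
    # Stack RGB + depth into one H x W x 4 array before any detection happens.
    fused = np.dstack([image.astype(np.float32), depth_map])
    # One detector sees both cues at once (placeholder: bright and near-range).
    return (fused[..., :3].mean(axis=2) > 200) & (fused[..., 3] < 20.0)

def late_fusion_pipeline(image, depth_map):
    """Detect per sensor first, then merge the intermediate results."""
    camera_mask = image.mean(axis=2) > 200   # intermediate result held in memory
    lidar_mask = depth_map < 20.0            # second intermediate result
    return camera_mask & lidar_mask          # fusion only happens at the end
```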
For both Action 4 Camera + AutoL LiDAR and HiK Camera + Hesai LiDAR setups, early fusion follows these key steps:
Sensor Calibration & Synchronization
Both the camera and LiDAR are calibrated to establish their intrinsic and extrinsic parameters.
Time synchronization ensures that data from both sensors corresponds to the same moment in time.
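As a concrete illustration, the sketch below shows one common way to hold these parameters and to pair camera frames with LiDAR sweeps by timestamp. The matrix values and the 50 ms tolerance are placeholders, not our actual calibration results.

```python
import numpy as np

# Placeholder calibration values for illustration only; in practice they come
# from an intrinsic (e.g., checkerboard) and extrinsic (camera-LiDAR) calibration.
K = np.array([[1000.0,    0.0, 960.0],    # camera intrinsics (fx, 0, cx)
              [   0.0, 1000.0, 540.0],    #                   (0, fy, cy)
              [   0.0,    0.0,   1.0]])
R = np.eye(3)                             # LiDAR-to-camera rotation
t = np.array([0.0, -0.1, -0.2])           # LiDAR-to-camera translation (meters)

def match_by_timestamp(image_stamps, lidar_stamps, tolerance=0.05):
    """Pair each camera frame with the closest LiDAR sweep within `tolerance` seconds."""
    lidar_stamps = np.asarray(lidar_stamps)
    pairs = []
    for i, ts in enumerate(image_stamps):
        j = int(np.argmin(np.abs(lidar_stamps - ts)))
        if abs(lidar_stamps[j] - ts) <= tolerance:
            pairs.append((i, j))          # (image index, LiDAR sweep index)
    return pairs
```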
Data Acquisition
Capture 2D images from the camera(s) and 3D point cloud data from the LiDAR(s).
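For illustration, a single frame pair might be loaded as below; the file names and the KITTI-style binary point cloud format are assumptions, not our actual recording format.

```python
import cv2
import numpy as np

# Hypothetical paths and formats, shown only to make the data shapes explicit.
image = cv2.imread("frames/000123.png")                 # H x W x 3 BGR image
points = np.fromfile("sweeps/000123.bin",
                     dtype=np.float32).reshape(-1, 4)   # N x (x, y, z, intensity)
xyz, intensity = points[:, :3], points[:, 3]
```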
Data Alignment (Projection of LiDAR onto Camera Image Plane)
LiDAR point cloud data is transformed into the camera's coordinate system.
This step involves extrinsic transformation using rotation and translation matrices.
The LiDAR depth information is projected onto the camera image, creating a fused representation.
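The sketch below spells out this projection with the standard pinhole model, reusing the placeholder K, R, and t from the calibration sketch above.

```python
import numpy as np

def project_lidar_to_image(xyz, K, R, t, image_shape):
    """Project LiDAR points (N x 3, LiDAR frame) onto the camera image plane.

    Returns integer pixel coordinates and the depth of every point that lands
    inside the image and in front of the camera.
    """
    cam_pts = xyz @ R.T + t                  # extrinsic: LiDAR frame -> camera frame
    in_front = cam_pts[:, 2] > 0.1           # keep points in front of the camera
    cam_pts = cam_pts[in_front]
    uvw = cam_pts @ K.T                      # intrinsic: camera frame -> pixels
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)
    depth = cam_pts[:, 2]
    h, w = image_shape[:2]
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv[valid], depth[valid]
```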
Fusion at the Raw Data Level
Instead of processing image and point cloud data separately, they are merged at the pixel level.
Depth information from the LiDAR is added to the camera image, improving object and marker detection accuracy.
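One simple way to realize this pixel-level merge is to rasterize the projected depths into an extra image channel, producing an RGB-D style array, as sketched below (continuing from the hypothetical helpers in the previous snippets).

```python
import numpy as np

def fuse_rgb_depth(image, uv, depth):
    """Attach a sparse depth channel to the image, giving an H x W x 4 array."""
    h, w = image.shape[:2]
    depth_map = np.zeros((h, w), dtype=np.float32)   # 0 means "no LiDAR return"
    for (u, v), d in zip(uv, depth):
        # If several points hit the same pixel, keep the closest one.
        if depth_map[v, u] == 0 or d < depth_map[v, u]:
            depth_map[v, u] = d
    return np.dstack([image.astype(np.float32), depth_map])

# Example usage, chaining the earlier sketches:
# uv, depth = project_lidar_to_image(xyz, K, R, t, image.shape)
# fused = fuse_rgb_depth(image, uv, depth)
```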
Marker Recognition & Lane Generation
The fused data is used to detect small, custom-made glass bead markers.
Lane boundaries are extracted by leveraging visual (camera) and depth (LiDAR) information, making the system robust in varying lighting conditions.
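As a rough illustration of how the fused array can then be consumed, the sketch below flags bright, near-range pixels as candidate glass-bead markers (the beads are retro-reflective, so they appear bright to the camera) and fits a simple polynomial through each side's candidates to form a lane boundary. The thresholds, the image-half split between left and right, and the polynomial model are illustrative assumptions, not a description of our actual detector.

```python
import numpy as np

def detect_marker_pixels(fused, brightness_thr=200, max_range=25.0):
    """Candidate marker pixels: bright in the image and backed by a LiDAR return."""
    rgb, depth = fused[..., :3], fused[..., 3]
    bright = rgb.mean(axis=2) > brightness_thr
    near = (depth > 0) & (depth < max_range)
    return np.argwhere(bright & near)            # (row, col) pixel coordinates

def fit_lane_boundary(marker_pixels, degree=2):
    """Fit column = f(row) through the marker pixels on one side of the road."""
    rows, cols = marker_pixels[:, 0], marker_pixels[:, 1]
    return np.polyfit(rows, cols, degree)        # polynomial lane model

# Example usage on the fused array from the previous sketch:
# pixels = detect_marker_pixels(fused)
# left = fit_lane_boundary(pixels[pixels[:, 1] < fused.shape[1] // 2])
# right = fit_lane_boundary(pixels[pixels[:, 1] >= fused.shape[1] // 2])
```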
Together, these steps provide the following benefits:
Improved Marker Recognition: The fusion of depth and visual features enhances the detection of small road markers.
Robust Lane Generation: Combines the strengths of both sensors to create accurate lane boundaries.
Better Handling of Challenging Conditions: Works well in low-light or high-glare situations where a camera alone would struggle.
Enhanced Data Completeness: Merging raw sensor data ensures a richer and more informative dataset for downstream processing.