Benchmarking Deep Learning Models for Aerial LiDAR Point Cloud Semantic Segmentation under Real Acquisition Conditions: A Case Study in Navarre

2026-03-23

Computer Vision and Pattern Recognition
AI summary

The authors tested four popular deep learning models to see how well they can label 3D points from aerial LiDAR scans captured during real flights over different areas of Navarre, Spain. They compared the models on common classes such as ground, vegetation, buildings, and vehicles, noting that some classes are harder to identify because they have fewer examples and more varied shapes. All models achieved high accuracy above 93%. KPConv performed best overall, while Point Transformer V3 was better at recognizing vehicles, a less common class. The study also found that some models run faster but at the cost of less reliable segmentation.

3D semantic segmentation, aerial LiDAR, deep learning models, KPConv, RandLA-Net, Superpoint Transformer, Point Transformer V3, mean IoU, class imbalance, geometric variability
Authors
Alex Salvatierra, José Antonio Sanz, Christian Gutiérrez, Mikel Galar
Abstract
Recent advances in deep learning have significantly improved 3D semantic segmentation, but most models focus on indoor or terrestrial datasets. Their behavior under real aerial acquisition conditions remains insufficiently explored, and although a few studies have addressed similar scenarios, they differ in dataset design, acquisition conditions, and model selection. To address this gap, we conduct an experimental benchmark evaluating several state-of-the-art architectures on a large-scale aerial LiDAR dataset acquired under operational flight conditions in Navarre, Spain, covering heterogeneous urban, rural, and industrial landscapes. This study compares four representative deep learning models, namely KPConv, RandLA-Net, Superpoint Transformer, and Point Transformer V3, across five semantic classes commonly found in airborne surveys, such as ground, vegetation, buildings, and vehicles, highlighting the inherent challenges of class imbalance and geometric variability in aerial data. Results show that all tested models achieve high overall accuracy exceeding 93%, with KPConv attaining the highest mean IoU (78.51%) through consistent performance across classes, particularly on challenging and underrepresented categories. Point Transformer V3 demonstrates superior performance on the underrepresented vehicle class (75.11% IoU), while Superpoint Transformer and RandLA-Net trade segmentation robustness for computational efficiency.
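The abstract reports results as per-class IoU and mean IoU, the standard metrics for semantic segmentation under class imbalance (unlike overall accuracy, IoU is not dominated by frequent classes such as ground). As a minimal sketch of how these metrics are computed from a confusion matrix (this code and the toy numbers are illustrative, not from the paper):

```python
import numpy as np

def per_class_iou(conf: np.ndarray) -> np.ndarray:
    """Per-class intersection-over-union from a confusion matrix.

    conf[i, j] = number of points with true class i predicted as class j.
    IoU_c = TP_c / (TP_c + FP_c + FN_c); classes absent from both
    prediction and ground truth get IoU 0 here (a convention choice).
    """
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp   # predicted as c but actually another class
    fn = conf.sum(axis=1) - tp   # truly c but predicted as another class
    denom = tp + fp + fn
    return np.where(denom > 0, tp / denom, 0.0)

def mean_iou(conf: np.ndarray) -> float:
    """Unweighted mean of per-class IoU (mIoU)."""
    return float(per_class_iou(conf).mean())

# Toy 3-class example (hypothetical counts); rows = ground truth.
conf = np.array([
    [90,  5,  5],
    [10, 80, 10],
    [ 2,  3, 15],   # a rare class, e.g. vehicles
])
print(per_class_iou(conf))  # rare class drags mIoU down despite high accuracy
print(mean_iou(conf))
```

Note how the rare third class dominates the drop in mIoU even though overall accuracy ((90+80+15)/220 ≈ 84%) stays high, which is exactly the class-imbalance effect the benchmark highlights for vehicles.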