Riverine Land Cover Mapping through Semantic Segmentation of Multispectral Point Clouds

2026-03-23
Computer Vision and Pattern Recognition

AI summary

The authors tested a deep learning model called Point Transformer v2 to classify riverside land cover (such as sand, water, and vegetation) using multispectral 3D LiDAR data, which records spectral intensity alongside shape. They trained the model on data from a river in Finland and found it performed much better when using both shape and spectral details rather than shape alone. They also combined data from another river to help the model generalize to new places. Their results show this method can support river environment monitoring and management by accurately identifying different land cover classes.

Point Transformer v2, LiDAR, semantic segmentation, multispectral data, riverine environments, mean Intersection over Union (mIoU), land cover mapping, deep neural network, geometry features, spectral features
Authors
Sopitta Thurachen, Josef Taher, Matti Lehtomäki, Leena Matikainen, Linnea Blåfield, Mikel Calle Navarro, Antero Kukko, Tomi Westerlund, Harri Kaartinen
Abstract
Accurate land cover mapping in riverine environments is essential for effective river management, ecological understanding, and geomorphic change monitoring. This study explores the use of Point Transformer v2 (PTv2), an advanced deep neural network architecture designed for point cloud data, for land cover mapping through semantic segmentation of multispectral LiDAR data in real-world riverine environments. We utilize the geometric and spectral information from the 3-channel LiDAR point cloud to map land cover classes including sand, gravel, low vegetation, high vegetation, forest floor, and water. The PTv2 model was trained and evaluated on point cloud data from the Oulanka River in northern Finland using both geometry and spectral features. To improve the model's generalization to new riverine environments, we additionally investigate multi-dataset training that adds sparsely annotated data from an additional river dataset. Results demonstrate that the full-feature configuration achieves a mean Intersection over Union (mIoU) of 0.950, significantly outperforming the geometry-only baseline. Ablation studies further revealed that intensity and reflectance features are key to accurate land cover mapping. The multi-dataset training experiment showed improved generalization performance, suggesting potential for developing more robust models despite limited high-quality annotated data. Our work demonstrates the potential of applying transformer-based architectures to multispectral point clouds in riverine environments, offering new capabilities for monitoring sediment transport and other river management applications.
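The evaluation metric reported above, mean Intersection over Union (mIoU), averages the per-class overlap between predicted and reference labels. The sketch below shows a minimal NumPy implementation over flat integer label arrays; the class list and the `mean_iou` helper are illustrative, not taken from the paper's code, and classes absent from both prediction and reference are skipped rather than counted as zero (one common convention).

```python
import numpy as np

# Hypothetical integer indices for the six land cover classes named in the abstract.
CLASSES = ["sand", "gravel", "low_veg", "high_veg", "forest_floor", "water"]

def mean_iou(pred, gt, num_classes):
    """Per-class IoU averaged into mIoU, from flat integer label arrays."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union == 0:
            continue  # class absent from both arrays: excluded from the mean
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy example with six points and three classes present:
gt   = np.array([0, 0, 1, 1, 5, 5])
pred = np.array([0, 1, 1, 1, 5, 5])
# class 0: 1/2, class 1: 2/3, class 5: 2/2 -> mean = 13/18
print(round(mean_iou(pred, gt, len(CLASSES)), 3))  # → 0.722
```

Per-point labels make this metric a natural fit for point cloud segmentation, since each LiDAR point carries exactly one predicted and one reference class.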