ImprovedGS+: A High-Performance C++/CUDA Re-Implementation Strategy for 3D Gaussian Splatting
2026-03-09 • Computer Vision and Pattern Recognition
AI summary
The author presents ImprovedGS+, a faster and more efficient approach to reconstructing 3D scenes with Gaussian Splatting. The ImprovedGS strategy is rewritten from Python into C++/CUDA, which reduces host-device synchronization and training time. The implementation adds a Long-Axis-Split (LAS) kernel, Laplacian-based importance kernels with Non-Maximum Suppression, and an adaptive Exponential Scale Scheduler to improve both speed and image quality. On the Mip-NeRF360 dataset, ImprovedGS+ trains faster and uses fewer Gaussians than previous methods while matching or exceeding their image quality.
3D Gaussian Splatting • CUDA kernels • Laplacian-based importance • Non-Maximum Suppression • Mip-NeRF360 dataset • PSNR • Parametric complexity • Host-device synchronization • Exponential Scale Scheduler
Authors
Jordi Muñoz Vicente
Abstract
Recent advancements in 3D Gaussian Splatting (3DGS) have shifted the focus toward balancing reconstruction fidelity with computational efficiency. In this work, we propose ImprovedGS+, a high-performance, low-level reinvention of the ImprovedGS strategy, implemented natively within the LichtFeld-Studio framework. By transitioning from high-level Python logic to hardware-optimized C++/CUDA kernels, we achieve a significant reduction in host-device synchronization and training latency. Our implementation introduces a Long-Axis-Split (LAS) CUDA kernel, custom Laplacian-based importance kernels with Non-Maximum Suppression (NMS) for edge scores, and an adaptive Exponential Scale Scheduler. Experimental results on the Mip-NeRF360 dataset demonstrate that ImprovedGS+ establishes a new Pareto-optimal front for scene reconstruction. Our 1M-budget variant outperforms the state-of-the-art MCMC baseline by achieving a 26.8% reduction in training time (saving 17 minutes per session) and utilizing 13.3% fewer Gaussians while maintaining superior visual quality. Furthermore, our full variant demonstrates a 1.28 dB PSNR increase over the ADC baseline with a 38.4% reduction in parametric complexity. These results validate ImprovedGS+ as a scalable, high-speed solution that upholds the core pillars of Speed, Quality, and Usability within the LichtFeld-Studio ecosystem.
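The abstract names a Long-Axis-Split (LAS) kernel but does not describe its internals. As a rough illustration of what the name suggests, the following CPU sketch splits one anisotropic Gaussian into two children placed on either side of the parent along its longest principal axis. It is a minimal sketch under stated assumptions, not the ImprovedGS+ kernel: the function names, the `(w, x, y, z)` quaternion convention, and the `offset_factor` and `shrink_factor` values are all hypothetical choices made for this example.

```python
def quat_to_rotmat(q):
    """Convert a unit quaternion (w, x, y, z) into a 3x3 rotation matrix."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def long_axis_split(mean, scales, quat, offset_factor=0.5, shrink_factor=0.5):
    """Split one Gaussian into two children along its longest axis.

    mean   -- parent center, 3 floats (world space)
    scales -- per-axis standard deviations in the Gaussian's local frame
    quat   -- unit quaternion (w, x, y, z) rotating local axes to world space
    offset_factor, shrink_factor -- ASSUMED illustrative values, not taken
        from the paper
    Returns a list of two (mean, scales, quat) tuples.
    """
    # Index of the longest local axis.
    k = max(range(3), key=lambda i: scales[i])

    # Column k of the rotation matrix is that axis expressed in world space.
    rot = quat_to_rotmat(quat)
    axis = [rot[row][k] for row in range(3)]

    # Children sit symmetrically on either side of the parent center.
    step = offset_factor * scales[k]

    # Only the long axis is shortened; the other two extents are kept.
    child_scales = list(scales)
    child_scales[k] = shrink_factor * scales[k]

    children = []
    for sign in (1.0, -1.0):
        child_mean = [m + sign * step * a for m, a in zip(mean, axis)]
        children.append((child_mean, list(child_scales), quat))
    return children
```

In a GPU implementation this per-Gaussian logic would typically run as one thread per selected Gaussian, writing both children directly into device buffers, which is one way to avoid the host-device round trips the abstract refers to.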