Aes3D: Aesthetic Assessment in 3D Gaussian Splatting

2026-05-06Computer Vision and Pattern Recognition

Computer Vision and Pattern RecognitionArtificial Intelligence
AI summary

The authors focus on evaluating how good 3D scenes created with a method called 3D Gaussian Splatting look, beyond just technical accuracy. They created a new dataset called Aesthetic3D with ratings about scene beauty and designed a model named Aes3DGSNet that predicts aesthetic scores directly from the 3D data without needing to render images. This makes it faster and less resource-heavy. Their system can understand higher-level art qualities like harmony and composition in 3D scenes, which previous methods missed.

3D Gaussian Splattingaesthetic assessmentneural rendering3D scene representationdataset annotationscene compositionmachine learning modelperceptual realismregressionlightweight model
Authors
Chuanzhi Xu, Boyu Wei, Haoxian Zhou, Xuanhua Yin, Zihan Deng, Haodong Chen, Qiang Qu, Weidong Cai
Abstract
As 3D Gaussian Splatting (3DGS) gains attention in immersive media and digital content creation, assessing the aesthetics of 3D scenes becomes important in helping creators build more visually compelling 3D content. However, existing evaluation methods for 3D scenes primarily emphasize reconstruction fidelity and perceptual realism, largely overlooking higher-level aesthetic attributes such as composition, harmony, and visual appeal. This limitation comes from two key challenges: (1) the absence of general 3DGS datasets with aesthetic annotations, and (2) the intrinsic nature of 3DGS as a low-level primitive representation, which makes it difficult to capture high-level aesthetic features. To address these challenges, we propose Aes3D, the first systematic framework for assessing the aesthetics of 3D neural rendering scenes. Aes3D includes Aesthetic3D, the first dataset dedicated to 3D scene aesthetic assessment, built on our proposed annotation strategy for 3D scene aesthetics. In addition, we present Aes3DGSNet, a lightweight model that directly predicts scene-level aesthetic scores from 3DGS representations. Notably, our model operates solely on 3D Gaussian primitives, eliminating the need for rendering multi-view images and thus reducing computational cost and hardware requirements. Through aesthetics-supervised learning on multi-view 3DGS scene representations, Aes3DGSNet effectively captures high-level aesthetic cues and accurately regresses aesthetic scores. Experimental results demonstrate that our approach achieves strong performance while maintaining a lightweight design, establishing a new benchmark for 3D scene aesthetic assessment. Code and datasets will be made available in a future version.