Learning the Geometry of Data: A Mathematical Review of Shape Space Analysis
2026-06-15 • Machine Learning
AI summary
The authors review how machine learning can analyze complex shapes that arise in fields such as biology, medicine, anthropology, and computer vision. They explain that traditional methods often struggle because shape data has nonlinear geometric structure. The survey organizes the research into four stages: representing and parameterizing shapes, constructing geodesic metrics to measure differences between them, analyzing their variation statistically, and applying geometry-aware learning techniques. They show how these tools help compare shapes and study changes across populations and over time, with examples ranging from subcellular morphology to primate tooth evolution. Finally, the authors identify open theoretical and computational challenges, along with new opportunities as shape datasets grow larger and more diverse.
machine learning, shape space, differential geometry, geodesic metrics, shape representation, statistical shape analysis, nonlinear geometry, geometry-aware learning, morphology, geometric data
Authors
Gary P. T. Choi, Khanh Dao Duc, Shira Faigenbaum-Golovin, Karen Habermann, Emmanuel Hartman, Christoph von Tycowicz, Chi Zhang, Wenjun Zhao, Felix Zhou
Abstract
A central objective of machine learning is to identify structure and patterns in data. Advances in data acquisition have increasingly produced datasets whose observations possess rich geometric form, giving rise to shape spaces that encode variability in object geometry. Such datasets arise across a wide range of disciplines, including biology, medicine, anthropology, and computer vision, where subtle geometric differences often carry important scientific information. Traditional machine learning methods, however, are frequently ill-equipped to account for the nonlinear geometric structure underlying these data. This survey synthesizes a rapidly growing body of work on shape space analysis, which provides a mathematical and computational framework for the study of geometric data. Drawing on ideas from differential geometry, statistics, and machine learning, we organize the literature around a common analytical pipeline: shape representation and parameterization, the rigorous construction of robust geodesic metrics, statistical analysis on shape spaces, and geometry-aware learning methods. We discuss how these tools enable the characterization of shape variability, the comparison of geometric objects, and the analysis of structural trajectories across populations and time. To illustrate the breadth of the field, we highlight applications spanning multiple scales of biological organization, including studies of subcellular morphology and primate tooth evolution. Across these and many other domains, researchers face common challenges arising from complex, nonlinear, and often unaligned geometric variation. The review concludes by identifying key theoretical and computational challenges, as well as emerging opportunities driven by increasingly large and diverse geometric datasets.
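As a concrete illustration of the first two pipeline stages (representation, then a geodesic metric), the sketch below uses Kendall's shape space for landmark configurations, one classical setting in which both steps have closed forms. This is an assumption chosen for illustration, not necessarily the construction the survey emphasizes, and the function names `preshape` and `kendall_distance` are hypothetical. Each shape is a (k, d) array of k landmarks in d dimensions; translation and scale are removed first, then rotation is optimized out via a singular value decomposition.

```python
import numpy as np


def preshape(X):
    """Map a (k, d) landmark array to its preshape.

    Centering removes translation; dividing by the Frobenius norm
    removes scale, so the result lies on a unit sphere.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    return X / np.linalg.norm(X)


def kendall_distance(X, Y):
    """Geodesic distance between two landmark shapes, modulo
    translation, scale, and proper rotation (illustrative sketch)."""
    A, B = preshape(X), preshape(Y)
    # Singular values of A^T B give the best achievable inner product
    # between the two preshapes after optimal alignment.
    U, s, Vt = np.linalg.svd(A.T @ B)
    # Restrict to proper rotations (no reflections): flip the sign of
    # the smallest singular value when the optimal map is a reflection.
    if np.linalg.det(U @ Vt) < 0:
        s[-1] = -s[-1]
    # Clip guards against round-off pushing the sum slightly above 1.
    return float(np.arccos(np.clip(s.sum(), -1.0, 1.0)))


if __name__ == "__main__":
    triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

    # The same triangle, rotated, scaled, and translated.
    theta = 0.7
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta), np.cos(theta)]])
    moved = 3.0 * triangle @ R.T + np.array([5.0, -2.0])

    # A genuinely different triangle (one side stretched).
    stretched = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])

    print("same shape, different pose:", kendall_distance(triangle, moved))
    print("different shape:", kendall_distance(triangle, stretched))
```

The first distance should be zero up to floating-point error, since the two configurations differ only by pose, while the second is strictly positive. The sketch assumes landmarks are already in correspondence; the unaligned, correspondence-free case mentioned in the abstract requires other representations and metrics.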