Interpretable Machine Learning for Antepartum Prediction of Pregnancy-Associated Thrombotic Microangiopathy Using Routine Longitudinal Laboratory Data

2026-05-13 • Machine Learning


AI summary

The authors studied a rare and dangerous pregnancy condition called pregnancy-associated thrombotic microangiopathy (P-TMA), which is hard to spot early because its signals often look like normal pregnancy changes. They applied machine learning to repeated lab test data from 300 pregnancies to find patterns that predict P-TMA risk before symptoms appear. Their best model, which used a technique called gradient boosting, identified at-risk patients with an AUROC of 0.872 on held-out data. One lab marker, cystatin C at week 6 of pregnancy, appeared especially helpful for early monitoring.

Pregnancy-associated thrombotic microangiopathy, Machine learning, Gradient boosting, Cystatin C, Longitudinal clinical tests, AUROC, Thrombocytopenia, Proteinuria, Sensitivity and specificity

Authors

Chuanchuan Sun, Zhen Yu, Qin Fan, Qingchao Chen, Feng Yu

Abstract

Background: Pregnancy-associated thrombotic microangiopathy (P-TMA) is rare but life-threatening. Predicting risk before overt clinical presentation remains challenging because the associated laboratory abnormalities are subtle and multidimensional. They are frequently masked by common physiological changes such as gestational thrombocytopenia and pregnancy-related proteinuria, and therefore overlap heavily with benign obstetric and renal conditions. Univariate and rule-based approaches capture this complexity poorly, whereas machine learning can extract latent, time-dependent risk signatures from longitudinal clinical tests.

Methods: This retrospective study included 300 pregnancies, comprising 142 P-TMA cases and 158 controls. After exclusion of identifiers and non-informative variables, 146 longitudinal laboratory predictors were retained. Participants were divided into a training cohort (80%) and a held-out test cohort (20%) using stratified sampling. Five algorithms were evaluated: logistic regression, support vector machine with a radial basis function kernel, random forest, extra trees, and gradient boosting. The final model was selected by mean cross-validated AUROC, refitted on the full training cohort, and evaluated once in the held-out test cohort. Interpretability analyses examined global feature importance and the distributional patterns of leading predictors.

Results: Gradient boosting had the highest mean cross-validated AUROC in the training cohort and was selected as the final model. In the held-out test cohort, it achieved an AUROC of 0.872 (95% CI: 0.769-0.952) and an AUPRC of 0.883 (95% CI: 0.780-0.959), with a sensitivity of 0.750 and a specificity of 0.812.

Conclusions: Longitudinal clinical laboratory tests obtained during routine care contained informative and clinically plausible signals of P-TMA risk. Notably, cystatin C at week 6 showed promise as an early monitoring indicator.
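The Results report AUROC, sensitivity, specificity, and 95% confidence intervals for the held-out test cohort. As a rough illustration of how such figures can be computed, the Python sketch below implements AUROC in its pairwise-ranking (Mann-Whitney) form, sensitivity and specificity at a fixed threshold, and a percentile bootstrap interval. The abstract does not state the decision threshold or how the intervals were obtained, so the 0.5 threshold, the percentile bootstrap, all function names, and the toy data are assumptions made for illustration, not the authors' procedure.

```python
import random


def auroc(y_true, scores):
    """AUROC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win (Mann-Whitney form).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Sensitivity and specificity when score >= threshold is called positive.

    The 0.5 default is an assumption; the abstract does not give a threshold.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return tp / (tp + fn), tn / (tn + fp)


def bootstrap_ci(y_true, scores, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a metric(y_true, scores) function.

    Resamples patients with replacement; resamples that contain only one
    class are redrawn, since AUROC is undefined for them.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [y_true[i] for i in idx]
        if len(set(yb)) < 2:
            continue
        sb = [scores[i] for i in idx]
        stats.append(metric(yb, sb))
    stats.sort()
    lower = stats[max(0, int((alpha / 2) * n_boot))]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)]
    return lower, upper


# Hypothetical toy data (1 = P-TMA case, 0 = control), not from the study.
y_toy = [0, 0, 1, 1]
s_toy = [0.1, 0.4, 0.35, 0.8]

# 3 of the 4 case-control pairs are ranked correctly, so AUROC = 0.75.
print("AUROC:", auroc(y_toy, s_toy))
print("Sensitivity, specificity:", sensitivity_specificity(y_toy, s_toy))
print("95% CI:", bootstrap_ci(y_toy, s_toy, auroc, n_boot=500))
```

A 20% hold-out from 300 pregnancies leaves about 60 patients for testing, so any interval estimated this way would be expected to be wide, as the reported intervals are.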
