Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation

2026-06-03 · Machine Learning

AI summary

The authors address a problem in Deep Unsupervised Domain Adaptation: models are hard to compare fairly because there is no good way to select the best one without labeled data from the target domain. They propose Deep Embedded Validation (DEV), which uses the model's learned features to estimate how well it will perform on the target data without needing labels. The approach reduces the variance of this estimate with a control variate, and the authors support it with both theoretical analysis and experiments.

Deep Unsupervised Domain Adaptation, model selection, target risk estimation, feature representation, control variate, validation procedure, variance reduction, unsupervised learning
Authors
Kaichao You, Ximei Wang, Mingsheng Long, Michael I. Jordan
Abstract
Deep unsupervised domain adaptation (Deep UDA) methods successfully leverage rich labeled data in a source domain to boost performance on related but unlabeled data in a target domain. However, algorithm comparison is cumbersome in Deep UDA due to the absence of an accurate and standardized model selection method, which poses an obstacle to further advances in the field. Existing model selection methods for Deep UDA are highly biased, restricted, unstable, or even controversial (requiring labeled target data). To this end, we propose Deep Embedded Validation (DEV), which embeds the adapted feature representation into the validation procedure to obtain an unbiased estimate of the target risk with bounded variance. The variance is further reduced by the technique of control variates. The efficacy of the method is justified both theoretically and empirically.
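
The abstract does not spell out the estimator, so the following is only a generic sketch of the idea it names: an importance-weighted risk estimate on labeled source validation data, with a control variate to reduce variance. It assumes the density ratios `weights` (target density over source density at each validation sample) are already available; how DEV obtains them from the adapted features is not shown here. The function name `control_variate_risk` and its arguments are hypothetical. The sketch relies on the standard fact that the density ratio has expectation 1 under the source distribution, so `mean(weights) - 1` is a zero-mean term that can be subtracted with a fitted coefficient.

```python
import numpy as np


def control_variate_risk(weights, losses):
    """Importance-weighted risk estimate with a control variate.

    weights: assumed density ratios q(x) / p(x) on source validation samples
             (hypothetical input; estimating them is outside this sketch)
    losses:  per-sample losses of the model on the same samples
    """
    weights = np.asarray(weights, dtype=float)
    losses = np.asarray(losses, dtype=float)
    weighted = weights * losses  # plain importance-weighted losses

    var_w = np.var(weights, ddof=1)
    if var_w == 0.0:
        # Constant weights carry no information to correct with.
        return float(np.mean(weighted))

    # Coefficient that minimizes the variance of the corrected estimate.
    cov = np.cov(weighted, weights, ddof=1)[0, 1]
    eta = -cov / var_w

    # E[weights] = 1 under the source distribution, so the added term
    # has zero mean and is used here only to cancel variance.
    return float(np.mean(weighted) + eta * (np.mean(weights) - 1.0))
```

Under these assumptions, model selection would amount to computing this estimate for each candidate model and keeping the one with the lowest value. One property worth noting: when every sample has the same loss, the correction cancels the weight noise exactly and the estimate equals that loss.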