Protecting the Undeleted in Machine Unlearning

2026-02-18

Machine Learning · Data Structures and Algorithms
AI summary

The authors study machine unlearning, which is about removing specific data points from a trained model as if they were never used. They show that trying to perfectly mimic retraining without those points can actually leak private information about the remaining data. They demonstrate an attack where an adversary can recover most of the dataset by cleverly using deletion requests. To fix this, the authors propose a new security definition that prevents such privacy leaks while still allowing useful features like summation and learning.

machine unlearning, perfect retraining, privacy risks, reconstruction attack, security definition, data deletion, statistical learning, bulletin boards, data leakage, model retraining
Authors
Aloni Cohen, Refael Kohen, Kobbi Nissim, Uri Stemmer
Abstract
Machine unlearning aims to remove specific data points from a trained model, often striving to emulate "perfect retraining", i.e., producing the model that would have been obtained had the deleted data never been included. We demonstrate that this approach, and security definitions that enable it, carry significant privacy risks for the remaining (undeleted) data points. We present a reconstruction attack showing that for certain tasks that can be computed securely without deletions, a mechanism adhering to perfect retraining allows an adversary controlling only $\omega(1)$ data points to reconstruct almost the entire dataset merely by issuing deletion requests. We survey existing definitions for machine unlearning, showing they are either susceptible to such attacks or too restrictive to support basic functionalities like exact summation. To address this problem, we propose a new security definition that specifically safeguards undeleted data against leakage caused by the deletion of other points. We show that our definition permits several essential functionalities, such as bulletin boards, summations, and statistical learning.
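The flavor of the attack can be conveyed with a toy construction (our own illustration, not the paper's actual attack): a hypothetical mechanism stores numbers and publishes only a single bit, whether the sum reaches a threshold, recomputed from scratch after every deletion (perfect retraining). Without deletions the output reveals at most one bit about the honest data; with adaptive deletions of the adversary's own unit-valued points, each deletion becomes a fresh query, and the position at which the bit flips reveals the honest sum exactly.

```python
# Toy illustration (hypothetical, not the paper's construction): perfect
# retraining turns deletion requests into a query channel on undeleted data.

class ThresholdMechanism:
    """Publishes one bit: whether the current sum reaches the threshold.
    After each deletion, the output is exactly what retraining on the
    remaining points would give (perfect retraining)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.points = {}   # point id -> value
        self.next_id = 0

    def insert(self, value):
        pid = self.next_id
        self.next_id += 1
        self.points[pid] = value
        return pid

    def delete(self, pid):
        del self.points[pid]

    def output(self):
        return int(sum(self.points.values()) >= self.threshold)


def reconstruction_attack(mech, adversary_ids):
    """Recover the honest sum S, assumed to lie in [0, k-1] where
    k = len(adversary_ids) = threshold, and each adversarial point has
    value 1. After d deletions the output is 1[S + k - d >= k], i.e.
    1[d <= S], so the bit flips to 0 right after the (S+1)-th deletion."""
    deletions = 0
    for pid in adversary_ids:
        if mech.output() == 0:
            break
        mech.delete(pid)
        deletions += 1
    return deletions - 1  # S = (number of deletions until the flip) - 1
```

For example, with honest values summing to 5 and eight adversarial unit points against threshold 8, the attack deletes six of its own points and outputs 5. Each adversarial point buys one adaptive query, matching the abstract's theme that controlling a growing number of points suffices to extract the undeleted data.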