AI summary
The authors introduce a new way to think about privacy called 'privacy via predictability,' which measures how much an attacker can improve their guesses about private information after seeing a system’s output, beyond what they could already infer from the data they have compromised. Unlike traditional differential privacy (DP), which protects against the worst possible attacker, this approach targets specific attackers with partial knowledge and specific sensitive queries, making it more fine-grained. They show that predictability and DP are generally incomparable (either can be small while the other is large), but in the worst case, where all but one individual are compromised and every binary query is sensitive, predictability implies mutual-information DP. The authors also develop a framework based on the generalized method of moments to analyze predictability asymptotically, and they use it to derive an output perturbation scheme for empirical risk minimization that can be used alongside DP for finer control over privacy.
Differential Privacy, Predictability, Privacy Leakage, Mutual Information, Generalized Method of Moments, Stochastic Process, Empirical Risk Minimization, Output Perturbation, Stationary Process, Ergodic Process
Abstract
Differential privacy (DP) ensures rigorous individual-level privacy guarantees against even the most knowledgeable attackers, but its worst-case nature can impose a costly privacy-accuracy tradeoff. We introduce privacy via predictability, a fine-grained framework that explicitly incorporates the attacker's core knowledge, a compromised portion of the dataset generated by a stochastic process, and a specified family of queries. Predictability measures privacy leakage as the incremental gain in an attacker's ability to predict sensitive information about unknown individuals after observing the algorithm's output, beyond what can already be inferred from the compromised data. We show that predictability and DP are generally incomparable: each can be small while the other is large. However, in the worst-case regime where all but one individual are compromised and all binary queries are considered sensitive, predictability implies mutual-information DP. More generally, predictability provides a finer-grained privacy metric tailored to specific sensitive information and specific attacker models. We introduce a general framework, using the generalized method of moments (GMM), to analyze asymptotic predictability when the compromised data is generated by a stationary, ergodic, mixing process. Using this analysis, we derive a predictability-calibrated output perturbation scheme for empirical risk minimization (ERM). Our approach is complementary to DP and can be used alongside DP to provide fine-grained privacy control.
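To make the idea of an "incremental gain in an attacker's ability to predict" concrete, the following is a minimal toy sketch and not the paper's formal definition. It assumes a hypothetical two-person dataset: a compromised bit `x_c` and an unknown bit `x_u` that equals `x_c` with probability `rho`. The hypothetical mechanism releases the true sum `x_c + x_u` with probability `1 - q` and a uniform value in {0, 1, 2} otherwise. The gain is taken here to be the increase in Bayes-optimal guessing accuracy for `x_u` once the output is observed. All names and the choice of accuracy as the measure are illustrative assumptions.

```python
# Toy illustration only: names, distributions, and the accuracy-based
# leakage measure are assumptions, not taken from the paper.

def joint_prob(x_c, x_u, rho):
    """P(x_c, x_u): x_c is a fair bit, x_u equals x_c with probability rho."""
    return 0.5 * (rho if x_u == x_c else 1.0 - rho)


def mech_prob(y, x_c, x_u, q):
    """P(y | x_c, x_u): true sum with prob. 1 - q, else uniform on {0, 1, 2}."""
    p = q / 3.0
    if y == x_c + x_u:
        p += 1.0 - q
    return p


def bayes_accuracy(rho, q, observe_output):
    """Accuracy of the best guess of x_u given x_c (and y, if observed)."""
    outputs = (0, 1, 2) if observe_output else (None,)
    total = 0.0
    for x_c in (0, 1):
        for y in outputs:
            # Joint mass of (x_c, y, x_u = b) for each candidate guess b.
            masses = []
            for x_u in (0, 1):
                m = joint_prob(x_c, x_u, rho)
                if observe_output:
                    m *= mech_prob(y, x_c, x_u, q)
                masses.append(m)
            # The optimal attacker guesses the more likely value of x_u.
            total += max(masses)
    return total


def predictability_gain(rho, q):
    """Extra guessing accuracy obtained by observing the mechanism output."""
    return bayes_accuracy(rho, q, True) - bayes_accuracy(rho, q, False)


if __name__ == "__main__":
    for rho, q in [(0.8, 0.0), (0.8, 1.0), (1.0, 0.0)]:
        print(f"rho={rho}, q={q}: gain={predictability_gain(rho, q):.3f}")
```

In this toy setting, with `rho = 0.8` and no noise (`q = 0`) the attacker's accuracy rises from 0.8 to 1, a gain of about 0.2. With `rho = 1` the gain is zero even without noise, because the compromised bit already reveals the unknown one. This loosely mirrors the abstract's point that leakage is measured relative to what the compromised data already implies.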