Do Privacy Policies Match with the Logs? An Empirical Study of Privacy Disclosure in Android Application Logs

2026-04-20

Cryptography and Security, Software Engineering
AI summary

The authors studied the privacy policies of 1,000 Android apps and compared them with the data those apps actually log. They found that while most apps (88.0%) have a privacy policy, fewer than a third (28.5%) explicitly mention logging practices. Among the policies that do, 27.7% of log-related statements were vague or overly simplistic about what data is recorded. In addition, 67.6% of apps logged sensitive information that their policies did not disclose. Overall, only 4% of apps had privacy policies consistent with the data they actually logged.

privacy policy, data logging, Android applications, sensitive information, data collection, privacy leakage, empirical study, application logs, data disclosure
Authors
Zhiyuan Chen, Love Jayesh Ahir, Ahmad Suleiman, Kundi Yao, Yiming Tang, Weiyi Shang, Daqing Hou
Abstract
Privacy policies are intended to inform users about how software systems collect and handle data, yet they often remain vague or incomplete. This paper presents an empirical study of patterns in log-related statements within privacy policies and their alignment with privacy disclosures observed in Android application logs. We analyzed 1,000 Android apps across multiple categories, generating 86,836,964 log entries. Our findings reveal that while most applications (88.0%) provide privacy policies, only 28.5% explicitly mention logging practices. Among those that reference logging, most clearly describe what information is logged; however, 27.7% of log-related statements remain overly simplistic or vague, offering limited insight into actual data collection. We further observed widespread privacy leakages in application logs, with 67.6% of apps leaking sensitive information not mentioned in their policies. Alarmingly, only 4% of applications demonstrated consistent alignment between declared policy contents and actual logged data. These findings highlight that current privacy policies provide incomplete or ambiguous descriptions of logging practices, which frequently do not align with actual logging behaviors.
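To make the policy-versus-log comparison concrete, the following is a minimal sketch of one way such a check could look. It is not the authors' pipeline: the regular expressions, the data-type names, and the helper functions `logged_types` and `undisclosed_types` are all hypothetical and chosen only for illustration. The idea is to detect which sensitive data types appear in log lines and report those that the policy does not declare.

```python
import re

# Hypothetical detectors for a few sensitive data types. These patterns are
# illustrative assumptions, not the detection rules used in the study.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ip_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "location": re.compile(r"\blat(?:itude)?\s*[=:]\s*-?\d+\.\d+", re.IGNORECASE),
}


def logged_types(log_lines):
    """Return the set of sensitive data types found in the given log lines."""
    found = set()
    for line in log_lines:
        for name, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(line):
                found.add(name)
    return found


def undisclosed_types(log_lines, declared_types):
    """Return data types that appear in the logs but not in the policy."""
    return logged_types(log_lines) - set(declared_types)


# Made-up log lines and a made-up policy declaration, for demonstration only.
sample_logs = [
    "D/Auth: login ok for user alice@example.com",
    "I/Net: connected from 192.168.1.20",
    "D/UI: rendered home screen",
]
declared = {"ip_address"}

print(sorted(undisclosed_types(sample_logs, declared)))
```

Under these assumptions, an app would count as aligned only when `undisclosed_types` returns an empty set. A real analysis would need far more robust detection of both sensitive data and policy statements than simple regular expressions.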