Privacy quantification, Data publishing, Security
Privacy-Preserving Data Publishing (PPDP) deals with the publication of microdata while preserving the private information of the individuals in the data. To measure how much private information is preserved, privacy metrics are needed. An essential element of any privacy metric is a measure of how much adversaries can learn about an individual’s sensitive attributes (SA) if they know the individual’s quasi-identifiers (QI); that is, we need to measure P(SA | QI). Such a measure is hard to derive when adversaries’ background knowledge has to be considered. We propose a systematic approach, Privacy-MaxEnt, to integrate background knowledge into privacy quantification. Our approach is based on the maximum entropy principle. We treat all the conditional probabilities P(SA | QI) as unknown variables, treat the background knowledge as constraints on these variables, and formulate additional constraints from the published data. Our goal then becomes finding an assignment to those variables (the probabilities) that satisfies all these constraints. Although many solutions may exist, the most unbiased estimate of P(SA | QI) is the one that achieves the maximum entropy.
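As a rough illustration of the idea, the toy sketch below estimates P(SA | QI) for a single hypothetical bucket of three records. It is not the solver used in the paper, which the abstract does not describe. It swaps in iterative proportional fitting (IPF), which converges to the maximum-entropy joint distribution when the only constraints are marginals from the published data plus background knowledge of the form "this QI value never has this SA value". The names `max_entropy_joint` and `conditional`, the bucket contents, and the background-knowledge rule are all assumptions made for the example.

```python
def max_entropy_joint(seed, row_targets, col_targets, iters=200):
    """Iterative proportional fitting (a stand-in technique, see lead-in).

    seed[i][j] is 1 if the pair (QI value i, SA value j) is allowed and 0 if
    background knowledge rules it out. Starting from this seed, IPF rescales
    rows and columns until they match the marginals taken from the published
    data, which yields the maximum-entropy joint distribution over the
    allowed cells (assuming the constraints are feasible).
    """
    p = [row[:] for row in seed]
    for _ in range(iters):
        # Match the QI marginals P(q).
        for i, target in enumerate(row_targets):
            total = sum(p[i])
            if total > 0:
                p[i] = [v * target / total for v in p[i]]
        # Match the SA marginals P(s).
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in p)
            if total > 0:
                for row in p:
                    row[j] *= target / total
    return p


def conditional(joint):
    """Turn a joint table P(q, s) into conditional rows P(s | q)."""
    return [[v / sum(row) for v in row] for row in joint]


# Hypothetical bucket: QI values q1, q2, q3 (one record each) and
# SA values [flu, cancer] appearing with counts 2 and 1.
p_qi = [1 / 3, 1 / 3, 1 / 3]
p_sa = [2 / 3, 1 / 3]

# No background knowledge: every (q, s) pair is allowed.
no_knowledge = conditional(max_entropy_joint([[1, 1]] * 3, p_qi, p_sa))

# Assumed background knowledge: q1 never has cancer, i.e. P(cancer | q1) = 0.
with_knowledge = conditional(
    max_entropy_joint([[1, 0], [1, 1], [1, 1]], p_qi, p_sa)
)

for name, table in [("no knowledge", no_knowledge),
                    ("with knowledge", with_knowledge)]:
    print(name, [[round(v, 3) for v in row] for row in table])
```

In this toy bucket, the estimate of P(cancer | q) is 1/3 for every QI value when there is no background knowledge. Once the adversary knows that q1 cannot have cancer, the estimate for q2 and q3 rises to 1/2, which is the kind of shift a privacy metric has to capture. Background knowledge expressed as general linear constraints on the probabilities would need a general constrained optimizer rather than IPF.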
Du, Wenliang; Teng, Zhouxuan; and Zhu, Zutao, "Privacy-maxent: integrating background knowledge in privacy quantification" (2008). Electrical Engineering and Computer Science. Paper 129.