"Privacy and utility analysis of the randomization approach in Privacy-" by Zhengli Huang

Electrical Engineering and Computer Science - Dissertations

Title

Privacy and utility analysis of the randomization approach in Privacy-Preserving Data Publishing

Author

Zhengli Huang, Syracuse University

Date of Award

2008

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Electrical Engineering and Computer Science

Advisor(s)

Wenliang (Kevin) Du

Keywords

Privacy-preserving, Data publishing, Randomization

Subject Categories

Computer Engineering | Engineering

Abstract

Randomization has emerged as an important approach to disguising data in Privacy-Preserving Data Publishing (PPDP). Depending on the type of data it is applied to, the randomization approach falls into two classes: Random Perturbation (RP) for continuous data and Randomized Response (RR) for categorical data. In PPDP, utility is an important metric that refers to how well the information needed for data mining is preserved, while privacy, an even more important metric, refers to how well the original information is protected. Privacy can be affected by several factors, such as attribute correlations and randomization parameters. However, for attribute correlations, no one has studied whether they are a factor affecting privacy or how they affect the privacy-preserving property of randomization. For randomization parameters, no one has investigated how to systematically compare different parameter choices, or which parameters are optimal in the sense that the disguised data are best protected yet still useful for data mining computations.
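To make the two classes concrete, the following is a minimal Python sketch and is not taken from the thesis. It assumes additive zero-mean Gaussian noise for RP and a Warner-style binary scheme for RR; the function names and the parameters `sigma` and `p` are hypothetical choices made for illustration only.

```python
import random


def random_perturbation(values, sigma, rng):
    """RP (assumed form): disguise continuous values with zero-mean Gaussian noise."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def randomized_response(bits, p, rng):
    """RR (assumed Warner-style): keep each binary value with probability p,
    otherwise report its complement."""
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_proportion(reported, p):
    """Estimate the true fraction of 1s from RR output.

    P(report = 1) = pi * p + (1 - pi) * (1 - p), solved for pi.
    Requires p != 0.5, otherwise the reports carry no information.
    """
    observed = sum(reported) / len(reported)
    return (observed - (1 - p)) / (2 * p - 1)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the demo is repeatable
    ages = [23.0, 35.0, 41.0, 58.0]
    print("disguised ages:", random_perturbation(ages, sigma=5.0, rng=rng))

    answers = [1] * 300 + [0] * 700  # true fraction of 1s is 0.3
    reported = randomized_response(answers, p=0.8, rng=rng)
    print("estimated fraction:", estimate_proportion(reported, p=0.8))
```

In this sketch each disguised value reveals little about an individual record, while aggregate quantities such as the fraction of 1s can still be estimated. That is the privacy and utility trade-off the abstract refers to; `sigma` and `p` stand in for the randomization parameters.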

This thesis addresses these problems. First, we identify the correlations among attributes as a key factor affecting privacy. We propose two data reconstruction methods based on correlations among continuous attributes, and we analyze the relationship between data correlations and the amount of private information that these reconstruction schemes can disclose. Our studies show that when the correlations are high, the original data can be reconstructed more accurately, i.e., more private information can be disclosed. To improve privacy, we propose a modified randomization scheme that takes the attribute correlations into account. Our experimental results show that, when the improved randomization method is used, both reconstruction methods become less accurate, i.e., less private information is disclosed. Second, for RR, we formulate the quantification of privacy and utility as estimation problems. Using these quantifications to compare different RR schemes, we employ an evolutionary multi-objective optimization method to find optimal randomization parameters for RR. The experimental results show that our scheme performs much better than the existing RR schemes. Third, for RP, we first formulate an RP technique that is more general than the existing one. After generalizing the RP technique, we discretize the data range and use a matrix to hold the randomization parameters. We also formulate the quantification of privacy and utility for the generalized RP technique as estimation problems. Because measuring utility is expensive, we propose an efficient approach to approximate it. Based on the privacy and approximate utility metrics, we use an evolutionary multi-objective optimization method to find optimal randomization parameters for RP. We show that our scheme for choosing the parameters outperforms the existing scheme.
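The effect of attribute correlations on reconstruction accuracy can be illustrated with a toy experiment. The sketch below is not one of the two reconstruction methods proposed in the thesis. It assumes a deliberately simple attack that averages the disguised attributes of each record, and every name and parameter in it (`make_records`, `spread`, `sigma`) is hypothetical.

```python
import math
import random


def make_records(n, m, spread, rng):
    """Build n records of m attributes that share one latent value.

    A small spread makes the attributes highly correlated; a large
    spread makes them nearly independent of one another.
    """
    records = []
    for _ in range(n):
        latent = rng.gauss(0.0, 1.0)
        records.append([latent + rng.gauss(0.0, spread) for _ in range(m)])
    return records


def disguise(records, sigma, rng):
    """Add independent zero-mean Gaussian noise to every attribute."""
    return [[v + rng.gauss(0.0, sigma) for v in rec] for rec in records]


def reconstruct_by_averaging(disguised):
    """Toy attack: estimate every attribute of a record by the record's mean."""
    return [[sum(rec) / len(rec)] * len(rec) for rec in disguised]


def rmse(estimate, original):
    """Root-mean-square error over all attributes of all records."""
    sq = [(x - y) ** 2
          for est_rec, orig_rec in zip(estimate, original)
          for x, y in zip(est_rec, orig_rec)]
    return math.sqrt(sum(sq) / len(sq))


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed so the demo is repeatable
    for label, spread in (("high correlation", 0.1), ("low correlation", 2.0)):
        original = make_records(2000, 10, spread, rng)
        noisy = disguise(original, 1.0, rng)
        error = rmse(reconstruct_by_averaging(noisy), original)
        print(label, "reconstruction RMSE:", round(error, 3))
```

Under these assumptions the averaging attack cancels much of the independent noise when the attributes are highly correlated, so its error falls well below the noise level. When the attributes are weakly correlated, the same attack does worse than using the disguised values directly. This mirrors, in a much simplified form, the finding that higher correlations allow more private information to be disclosed.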

Access

Surface provides description only. Full text is available to ProQuest subscribers. Ask your Librarian for assistance.

Recommended Citation

Huang, Zhengli, "Privacy and utility analysis of the randomization approach in Privacy-Preserving Data Publishing" (2008). Electrical Engineering and Computer Science - Dissertations. 23.
https://surface.syr.edu/eecs_etd/23

http://libezproxy.syr.edu/login?url=http://proquest.umi.com/pqdweb?did=1679685771&sid=1&Fmt=2&clientId=3739&RQT=309&VName=PQD

Link to Full Text
