Document Type

Working Paper



Privacy, collaborative filtering, randomized perturbation




Computer Sciences


Collaborative filtering (CF) techniques are becoming increasingly popular as the Internet evolves. E-commerce sites use CF systems to suggest products to customers based on the preferences of like-minded customers, and people use CF systems to cope with information overload. Collaborative filtering requires data from customers. However, collecting high-quality data is difficult because many customers are so concerned about their privacy that they may give false information, and CF systems built on such data may produce inaccurate recommendations. We propose a randomized perturbation technique that protects users' privacy while still producing accurate recommendations. Although the technique adds randomness to the original data to prevent the data collector from learning private user data, our scheme can still provide recommendations with decent accuracy. We conducted several experiments to compare recommendations computed on the randomized data with those computed on the original data. Using these experimental results, we analyzed how different parameters affect accuracy. Our results show that CF systems using randomized perturbation techniques provide accurate recommendations while preserving users' privacy.
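
The following Python sketch illustrates the general idea behind randomized perturbation; it is not the scheme evaluated in the paper. It assumes that each user adds zero-mean uniform noise to every rating before sending it to the data collector, and that the collector only needs aggregates such as per-item means. The function names, the noise range alpha, the 1-5 rating scale, and the synthetic ratings are all hypothetical choices made for illustration.

```python
import random


def perturb(ratings, alpha, rng):
    """Disguise ratings by adding zero-mean uniform noise from [-alpha, alpha].

    Assumption: uniform noise is one possible choice; the abstract does not
    specify the distribution used.
    """
    return [r + rng.uniform(-alpha, alpha) for r in ratings]


def item_means(matrix):
    """Column-wise mean of a users x items rating matrix (list of rows)."""
    n_users = len(matrix)
    return [sum(column) / n_users for column in zip(*matrix)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n_users, n_items, alpha = 5000, 4, 2.0

# Hypothetical true ratings on a 1-5 scale (synthetic, not experimental data).
true_ratings = [[rng.randint(1, 5) for _ in range(n_items)]
                for _ in range(n_users)]

# Each user perturbs locally; the collector only ever sees disguised rows.
disguised = [perturb(row, alpha, rng) for row in true_ratings]

true_means = item_means(true_ratings)
estimated_means = item_means(disguised)

# Individual ratings are hidden, but zero-mean noise largely cancels
# when many users are aggregated.
max_error = max(abs(t - e) for t, e in zip(true_means, estimated_means))
print("largest error in estimated item means:", max_error)
```

In this sketch a single disguised rating can differ from the true one by up to alpha, while the aggregate error shrinks as the number of users grows. A larger alpha hides individual ratings better but makes the aggregates noisier, which is the kind of privacy and accuracy trade-off that the parameter analysis in the abstract refers to.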