Calibrating Noise to Sensitivity in Private Data Analysis

Consider a query function $f$ applied to a database. The true answer may leak private information contained in the database. How much noise must be added to the answer to preserve privacy?

We prove that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function $f$. The sensitivity is the maximum amount, over the domain of $f$, by which any single argument to $f$ (that is, any single row in the database) can change the output.
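The calibration described above can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the noise is drawn from a Laplace distribution with scale equal to the sensitivity divided by ε, and the names `noise_scale`, `laplace_mechanism`, and `count_query` are hypothetical helpers introduced only for this example.

```python
import random


def noise_scale(sensitivity, epsilon):
    """Laplace scale b = sensitivity / epsilon (assumed calibration).

    A Laplace distribution with scale b has standard deviation b * sqrt(2),
    so the standard deviation of the noise grows linearly with sensitivity.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon


def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
    """Return the true answer plus Laplace noise calibrated to sensitivity."""
    rng = rng if rng is not None else random.Random()
    b = noise_scale(sensitivity, epsilon)
    # The difference of two i.i.d. exponentials with rate 1/b
    # is Laplace-distributed with mean 0 and scale b.
    noise = rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
    return true_answer + noise


def count_query(rows, predicate):
    """Counting query: changing one row changes the result by at most 1,
    so its sensitivity is 1."""
    return sum(1 for row in rows if predicate(row))
```

For instance, `laplace_mechanism(count_query(rows, pred), sensitivity=1.0, epsilon=0.5)` would release a noisy count; a smaller ε or a larger sensitivity both produce a wider noise distribution.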

A “transcript” is the record of an interaction between a user and a privacy mechanism, for example a single query function and its response.

Roughly speaking, a privacy mechanism is ε-indistinguishable if for all transcripts t and for all databases x and x′ differing in a single row, the probability of obtaining transcript t when the database is x is within a (1 + ε) multiplicative factor of the probability of obtaining transcript t when the database is x′. More precisely, we require the absolute value of the logarithm of the ratio of these two probabilities to be bounded by ε. In our work, ε is a parameter chosen by policy.
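Written out as a formula, the requirement reads as below. The notation $T(x)$ for the random transcript produced on database $x$ is an assumption introduced here for illustration and does not appear in the text above.

$$
\left| \ln \frac{\Pr[T(x) = t]}{\Pr[T(x') = t]} \right| \le \varepsilon
\quad\Longleftrightarrow\quad
e^{-\varepsilon} \le \frac{\Pr[T(x) = t]}{\Pr[T(x') = t]} \le e^{\varepsilon}
$$

For small ε, $e^{\varepsilon} \approx 1 + \varepsilon$, which is where the informal "(1 + ε) multiplicative factor" reading comes from.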