
Angelika Rohde (University of Freiburg, Germany)

20 January 2023 @ 12:00 - 13:00

 

  • Past event

Details

Date: 20 January 2023
Time: 12:00 - 13:00
Event Category: Academic Events

Estimating functionals under local differential privacy


Abstract. We study the problem of estimating a functional $\theta(\mathbb{P})$ of an unknown probability distribution $\mathbb{P} \in \mathcal{P}$ in which the original iid sample $X_1,\dots, X_n$ is kept private even from the statistician via an $\alpha$-local differential privacy constraint. Let $\omega_{TV}$ denote the modulus of continuity of the functional $\theta$ over $\mathcal{P}$ with respect to total variation distance. For a large class of loss functions $l$ and a fixed privacy level $\alpha$, we prove that the privatized minimax risk is equivalent to $l(\omega_{TV}(n^{-1/2}))$ to within constants, under regularity conditions that are satisfied, in particular, if $\theta$ is linear and $\mathcal{P}$ is convex. Our results complement the theory developed by Donoho and Liu (1991 Ann. Statist. 19 633–667) by treating the now highly relevant case of privatized data. Somewhat surprisingly, the difficulty of the estimation problem in the private case is characterized by $\omega_{TV}$, whereas it is characterized by the Hellinger modulus of continuity if the original data $X_1,\dots, X_n$ are available. We also find that, for locally private estimation of linear functionals over a convex model, a simple sample mean estimator based on independently and binary privatized observations always achieves the minimax rate. In particular, a non-interactive procedure attains this rate even over the larger class of so-called sequentially interactive privacy mechanisms.

Next, we turn to one of the most studied non-linear functionals, the quadratic functional. Here, in contrast, we show that for estimating the integrated square of a density, sequentially interactive privacy mechanisms improve substantially over the best possible non-interactive procedure in terms of the minimax rate of estimation. In particular, in the non-interactive scenario we identify an elbow in the minimax rate at $s=3/4$, whereas in the sequentially interactive scenario the elbow is at $s=1/2$. This is markedly different both from the case of direct observations, where the elbow is well known to be at $s=1/4$, and from the case where Laplace noise is added to the original data, where an elbow at $s=9/4$ is obtained.
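
The abstract does not spell out the binary privatization behind the sample mean estimator. As an illustration only, the following minimal sketch uses one standard non-interactive binary mechanism (randomized response) and is not necessarily the speaker's construction. It assumes the linear functional is the mean of a quantity taking values in $[0,1]$; the function names `privatize_binary` and `debiased_mean` are hypothetical.

```python
import math
import random


def keep_probability(alpha):
    """Probability of reporting the true bit under alpha-randomized response."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return math.exp(alpha) / (1.0 + math.exp(alpha))


def privatize_binary(value, alpha, rng):
    """Release one private bit for a single value in [0, 1].

    Step 1: round the value to a bit that equals 1 with probability `value`.
    Step 2: flip that bit with probability 1 / (1 + e^alpha).
    The output probabilities for any two inputs differ by a factor of at
    most e^alpha, which is the alpha-local differential privacy condition.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0, 1]")
    bit = 1 if rng.random() < value else 0
    if rng.random() >= keep_probability(alpha):
        bit = 1 - bit
    return bit


def debiased_mean(bits, alpha):
    """Sample mean of the private bits, corrected for the flipping bias.

    E[bit] = (2p - 1) * mean + (1 - p) with p = keep_probability(alpha),
    so inverting this affine map gives an unbiased estimate of the mean.
    """
    p = keep_probability(alpha)
    raw = sum(bits) / len(bits)
    return (raw - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(2023)
    data = [rng.random() ** 2 for _ in range(20000)]  # synthetic values in [0, 1]
    private_bits = [privatize_binary(x, 1.0, rng) for x in data]
    print("true mean:     ", sum(data) / len(data))
    print("private estimate:", debiased_mean(private_bits, 1.0))
```

Each individual sees only their own value and releases a single bit, so the procedure is non-interactive. For fixed $\alpha$ the estimator's standard deviation is of order $n^{-1/2}$, with a constant that grows as $\alpha$ shrinks.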