Priv'IT: Private and Sample Efficient Identity Testing

Abstract · Mar 29, 2017 16:42

cs-ds cs-cr cs-it cs-lg math-it math-st stat-th

arXiv Abstract

  • Bryan Cai
  • Constantinos Daskalakis
  • Gautam Kamath

We develop differentially private hypothesis testing methods for the small sample regime. Given a sample $\cal D$ from a categorical distribution $p$ over some domain $\Sigma$, an explicitly described distribution $q$ over $\Sigma$, some privacy parameter $\varepsilon$, accuracy parameter $\alpha$, and requirements $\beta_{\rm I}$ and $\beta_{\rm II}$ for the type I and type II errors of our test, the goal is to distinguish between $p=q$ and $d_{\rm TV}(p,q) \geq \alpha$. We provide theoretical bounds for the sample size $|{\cal D}|$ so that our method both satisfies $(\varepsilon,0)$-differential privacy, and guarantees $\beta_{\rm I}$ and $\beta_{\rm II}$ type I and type II errors. We show that differential privacy may come for free in some regimes of parameters, and we always beat the sample complexity resulting from running the $\chi^2$-test with noisy counts, or standard approaches such as repetition for endowing non-private $\chi^2$-style statistics with differential privacy guarantees. We experimentally compare the sample complexity of our method to that of recently proposed methods for private hypothesis testing.
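For orientation, the "$\chi^2$-test with noisy counts" baseline mentioned in the abstract can be sketched roughly as follows. This is not the Priv'IT method from the paper, only a minimal illustration of the baseline it is compared against. The function names, the use of Laplace noise with scale $2/\varepsilon$ (which assumes neighboring datasets differ by replacing one sample), the assumption that $q$ has full support on $\Sigma$, and the Monte Carlo calibration of the rejection threshold are all choices made here for illustration and do not come from the source.

```python
import random
from collections import Counter


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_chi2_statistic(sample, q, epsilon, rng):
    """Chi-squared statistic computed on Laplace-noised counts.

    sample: list of symbols; q: dict symbol -> probability (assumed > 0).
    """
    n = len(sample)
    counts = Counter(sample)
    stat = 0.0
    for symbol, q_i in q.items():
        # Assumption: replacing one sample changes two counts by 1 each,
        # so the histogram has L1 sensitivity 2 and scale 2/epsilon suffices.
        noisy = counts.get(symbol, 0) + laplace(2.0 / epsilon, rng)
        expected = n * q_i
        stat += (noisy - expected) ** 2 / expected
    return stat


def null_threshold(q, n, epsilon, beta_I, rng, trials=2000):
    """Monte Carlo estimate of the (1 - beta_I) quantile under p = q.

    Uses only q, n, and epsilon, so it touches no private data.
    """
    symbols = list(q)
    weights = [q[s] for s in symbols]
    stats = sorted(
        noisy_chi2_statistic(rng.choices(symbols, weights, k=n), q, epsilon, rng)
        for _ in range(trials)
    )
    return stats[min(trials - 1, int((1 - beta_I) * trials))]


def private_identity_test(sample, q, epsilon, threshold, rng):
    # Returns True when the test rejects the hypothesis p = q.
    return noisy_chi2_statistic(sample, q, epsilon, rng) > threshold
```

A caller would first compute `null_threshold(q, n, epsilon, beta_I, rng)` and then pass the result to `private_identity_test`. Because the test only post-processes the noised counts, the privacy guarantee of the Laplace mechanism carries over to the decision; the type II error at distance $\alpha$ is not controlled by this sketch and would have to be checked separately.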