Private data release via learning thresholds

Published: 17 January 2012

Abstract

This work considers computationally efficient privacy-preserving data release. We study the task of analyzing a database containing sensitive information about individual participants. Given a set of statistical queries on the data, we want to release approximate answers to the queries while also guaranteeing differential privacy---protecting each participant's sensitive data.
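As a point of reference, a single counting query can be answered privately with the Laplace mechanism of [Dwork, McSherry, Nissim, and Smith 2006]. The sketch below is only that baseline, not the algorithm of this paper; the function names, the row encoding, and the replace-one-row notion of neighboring databases are assumptions made for illustration.

```python
import random


def counting_query(database, predicate):
    """Exact (non-private) answer: fraction of rows satisfying the predicate."""
    return sum(1 for row in database if predicate(row)) / len(database)


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials.

    random.expovariate takes a rate, so rate 1/scale gives mean `scale`.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_answer(database, predicate, epsilon, rng):
    """Laplace mechanism for one normalized counting query.

    Assumption: neighboring databases differ by replacing one row, so the
    normalized count changes by at most 1/n (its sensitivity).
    """
    sensitivity = 1.0 / len(database)
    noise = laplace_noise(sensitivity / epsilon, rng)
    return counting_query(database, predicate) + noise


# Hypothetical usage: rows are bit tuples, the query asks "is bit 0 set?"
rows = [(1, 0), (1, 1), (0, 1), (0, 0)]
noisy = private_answer(rows, lambda r: r[0] == 1, epsilon=0.5,
                       rng=random.Random(0))
```

Answering each query of a large family this way makes the noise grow with the number of queries, which is one reason dedicated data release algorithms are studied.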
Our focus is on computationally efficient data release algorithms; we seek algorithms whose running time is polynomial, or at least sub-exponential, in the data dimensionality. Our primary contribution is a computationally efficient reduction from differentially private data release for a class of counting queries, to learning thresholded sums of predicates from a related class.
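One way to read the objects in this reduction is sketched below. The symbols D, n, P, f_D, and t are notation assumed here for illustration; the abstract itself does not fix any.

```latex
% Assumed notation: D = (x_1, ..., x_n) is the database and P(x, y) is the
% predicate evaluated on row x for the counting query indexed by y.
\[
  f_D(y) \;=\; \frac{1}{n} \sum_{i=1}^{n} P(x_i, y)
  \qquad \text{(answer to the counting query indexed by } y\text{)}
\]
\[
  f_{D,t}(y) \;=\; \mathbf{1}\!\left[\, f_D(y) \ge t \,\right]
  \qquad \text{(thresholded sum of the predicates } y \mapsto P(x_i, y)\text{)}
\]
```

Under this reading, knowing f_{D,t} on a grid of thresholds t determines f_D(y) up to the grid spacing. How the learning step is carried out privately and efficiently is not described in the abstract.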
We instantiate this general reduction with algorithms for learning thresholds, obtaining new results for differentially private data release. As two examples, taking {0,1}^d to be the data domain (of dimension d), we obtain differentially private algorithms for:
1. Releasing all k-way conjunction counting queries (or k-way contingency tables). For any given k, the resulting data release algorithm has bounded error as long as the database is of size at least d^{O([EQUATION])} (ignoring the dependence on other parameters). The running time is polynomial in the database size. The best sub-exponential time algorithms known prior to our work required a database of size Õ(d^{k/2}) [Dwork, McSherry, Nissim, and Smith 2006].
2. Releasing any family of counting queries that is specified by a constant-depth (AC^0) predicate. This algorithm releases accurate answers to a (1 − γ)-fraction of the queries in the family. For any γ ≥ quasipoly(1/d), the algorithm has bounded error as long as the database is of size at least quasipoly(d) (again ignoring the dependence on other parameters). The running time is quasipoly(d).
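To make the first query family concrete, the sketch below enumerates k-way conjunction counting queries over {0,1}^d and computes their exact, non-private answers, which are the values a data release algorithm would approximate. It is a simplification under stated assumptions: only monotone conjunctions (no negated attributes) are listed, and the function name and dictionary layout are hypothetical.

```python
from itertools import combinations


def k_way_conjunction_answers(database, k):
    """Exact answers to all monotone k-way conjunction counting queries.

    Each query is a set of k attribute indices; its answer is the fraction
    of rows in which every selected attribute equals 1.
    Assumption: negated literals are omitted to keep the sketch short.
    """
    n = len(database)
    d = len(database[0])
    answers = {}
    for subset in combinations(range(d), k):
        satisfied = sum(1 for row in database if all(row[i] for i in subset))
        answers[subset] = satisfied / n
    return answers


# Hypothetical toy database with n = 4 rows over d = 3 binary attributes.
rows = [(1, 1, 0), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
two_way = k_way_conjunction_answers(rows, k=2)
```

The number of such queries is C(d, k), so for growing k the family quickly becomes too large to answer one query at a time.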
The first learning algorithm uses techniques for representing thresholded sums of predicates as low-degree polynomial threshold functions. The second learning algorithm is based on a result of Jackson, Klivans, and Servedio [JKS 2002], and utilizes Fourier analysis of the database viewed as a function mapping queries to answers.
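The phrase "the database viewed as a function mapping queries to answers" can be illustrated with a brute-force computation on a tiny domain. The sketch below is not the algorithm of [JKS 2002]; it only shows what a Fourier coefficient of such a function is. The indexing of queries by y in {0,1}^d (a monotone conjunction over the coordinates where y is 1), the uniform distribution over y, and all function names are assumptions.

```python
from itertools import product


def query_answer(database, y):
    """f_D(y): fraction of rows satisfying the conjunction selected by y."""
    selected = [i for i, bit in enumerate(y) if bit]
    satisfied = sum(1 for row in database if all(row[i] for i in selected))
    return satisfied / len(database)


def fourier_coefficient(database, subset):
    """Coefficient of f_D on the character chi_S(y) = (-1)^(sum of y_i, i in S).

    Computed by exhaustive enumeration of all 2^d queries, so this is only
    feasible for very small d.
    """
    d = len(database[0])
    total = 0.0
    for y in product((0, 1), repeat=d):
        chi = -1 if sum(y[i] for i in subset) % 2 else 1
        total += query_answer(database, y) * chi
    return total / 2 ** d


# Hypothetical one-row database over d = 2 attributes.
rows = [(1, 0)]
mean_answer = fourier_coefficient(rows, subset=())
```

The coefficient for the empty set is the average query answer; the other coefficients measure how the answers correlate with parities of the query's selected coordinates.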

References

[1]
{Ajt83} Miklós Ajtai. Σ^1_1-formulae on finite structures. Ann. Pure Appl. Logic, 24(1):1--43, 1983.
[2]
{BCD+07} Boaz Barak, Kamalika Chaudhuri, Cynthia Dwork, Satyen Kale, Frank McSherry, and Kunal Talwar. Privacy, accuracy, and consistency too: a holistic solution to contingency table release. In Proc. 26th Symposium on Principles of Database Systems (PODS), pages 273--282. ACM, 2007.
[3]
{BCdWZ99} Harry Buhrman, Richard Cleve, Ronald de Wolf, and Christof Zalka. Bounds for small-error and zero-error quantum algorithms. In Proc. 40th Foundations of Computer Science (FOCS), pages 358--368. IEEE, 1999.
[4]
{BDMN05} Avrim Blum, Cynthia Dwork, Frank McSherry, and Kobbi Nissim. Practical privacy: the SuLQ framework. In Proc. 24th Symposium on Principles of Database Systems (PODS), pages 128--138. ACM, 2005.
[5]
{BLR08} Avrim Blum, Katrina Ligett, and Aaron Roth. A learning theory approach to non-interactive database privacy. In Proc. 40th STOC, pages 609--618. ACM, 2008.
[6]
{Che66} Elliott W. Cheney. Introduction to Approximation Theory. McGraw-Hill, New York, New York, 1966.
[7]
{DMNS06} Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Proc. 3rd TCC, pages 265--284. Springer, 2006.
[8]
{DNR+09} Cynthia Dwork, Moni Naor, Omer Reingold, Guy N. Rothblum, and Salil P. Vadhan. On the complexity of differentially private data release: efficient algorithms and hardness results. In Proc. 41st STOC, pages 381--390. ACM, 2009.
[9]
{DRV10} Cynthia Dwork, Guy N. Rothblum, and Salil Vadhan. Boosting and differential privacy. In Proc. 51st Foundations of Computer Science (FOCS). IEEE, 2010.
[10]
{GHRU11} Anupam Gupta, Moritz Hardt, Aaron Roth, and Jon Ullman. Privately releasing conjunctions and the statistical query barrier. In Proc. 43rd STOC, pages 803--812. ACM, 2011.
[11]
{HR10} Moritz Hardt and Guy Rothblum. A multiplicative weights mechanism for privacy-preserving data analysis. In Proc. 51st Foundations of Computer Science (FOCS), pages 61--70. IEEE, 2010.
[12]
{HS07} Lisa Hellerstein and Rocco A. Servedio. On PAC learning algorithms for rich boolean function classes. Theoretical Computer Science, 384(1):66--76, 2007.
[13]
{Jac97} Jeffrey C. Jackson. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. Journal of Computer and System Sciences, 55(3):414--440, 1997.
[14]
{JKS02} Jeffrey Jackson, Adam Klivans, and Rocco A. Servedio. Learnability beyond AC^0. In Proc. 34th STOC, pages 776--784. ACM, 2002.
[15]
{KLN+08} Shiva Prasad Kasiviswanathan, Homin K. Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. What can we learn privately? In Proc. 49th Foundations of Computer Science (FOCS), pages 531--540. IEEE, 2008.
[16]
{KM93} E. Kushilevitz and Y. Mansour. Learning decision trees using the Fourier spectrum. SIAM J. on Computing, 22(6):1331--1348, 1993.
[17]
{KMR+94} Michael J. Kearns, Yishay Mansour, Dana Ron, Ronitt Rubinfeld, Robert E. Schapire, and Linda Sellie. On the learnability of discrete distributions. In Proc. 26th STOC, pages 273--282. ACM, 1994.
[18]
{KOS04} Adam Klivans, Ryan O'Donnell, and Rocco A. Servedio. Learning intersections and thresholds of halfspaces. Journal of Computer and System Sciences, 68(4):808--840, 2004.
[19]
{KRSU10} Shiva Kasiviswanathan, Mark Rudelson, Adam Smith, and Jonathan Ullman. The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In Proc. 42nd STOC, pages 775--784. ACM, 2010.
[20]
{KS04} Adam Klivans and Rocco A. Servedio. Learning DNF in time 2^{Õ(n^{1/3})}. Journal of Computer and System Sciences, 68(2):303--318, 2004.
[21]
{Nao96} Moni Naor. Evaluation may be easier than generation. In Proc. 28th STOC, pages 74--83. ACM, 1996.
[22]
{RR10} Aaron Roth and Tim Roughgarden. Interactive privacy via the median mechanism. In Proc. 42nd STOC, pages 765--774. ACM, 2010.
[23]
{She09} Alexander A. Sherstov. The intersection of two halfspaces has high threshold degree. In Proc. 50th Foundations of Computer Science (FOCS). IEEE, 2009.
[24]
{UV11} Jonathan Ullman and Salil P. Vadhan. PCPs and the hardness of generating private synthetic data. In TCC, pages 400--416. Springer, 2011.
[25]
{Val84} Leslie Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134--1142, 1984.

Published In

SODA '12: Proceedings of the twenty-third annual ACM-SIAM symposium on Discrete algorithms
January 2012
1764 pages

Sponsors

  • Kyoto University

Publisher

Society for Industrial and Applied Mathematics

United States

Acceptance Rates

Overall Acceptance Rate 411 of 1,322 submissions, 31%

Cited By

  • (2016) Order-Revealing Encryption and the Hardness of Private Learning. Proceedings, Part I, of the 13th International Conference on Theory of Cryptography - Volume 9562, pages 176-206. DOI: 10.1007/978-3-662-49096-9_8. Online publication date: 10-Jan-2016
  • (2015) Private Multiplicative Weights Beyond Linear Queries. Proceedings of the 34th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 303-312. DOI: 10.1145/2745754.2745755. Online publication date: 20-May-2015
  • (2015) Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations. Discrete & Computational Geometry 53:3, pages 650-673. DOI: 10.1007/s00454-015-9678-x. Online publication date: 1-Apr-2015
  • (2014) Exploiting metric structure for efficient private query release. Proceedings of the twenty-fifth annual ACM-SIAM symposium on Discrete algorithms, pages 523-534. DOI: 10.5555/2634074.2634113. Online publication date: 5-Jan-2014
  • (2014) PriView. Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pages 1435-1446. DOI: 10.1145/2588555.2588575. Online publication date: 18-Jun-2014
  • (2014) Using Convex Relaxations for Efficiently and Privately Releasing Marginals. Proceedings of the thirtieth annual symposium on Computational geometry, pages 261-270. DOI: 10.1145/2582112.2582123. Online publication date: 8-Jun-2014
  • (2014) Faster private release of marginals on small databases. Proceedings of the 5th conference on Innovations in theoretical computer science, pages 387-402. DOI: 10.1145/2554797.2554833. Online publication date: 12-Jan-2014
  • (2013) Answering n^{2+o(1)} counting queries with differential privacy is hard. Proceedings of the forty-fifth annual ACM symposium on Theory of Computing, pages 361-370. DOI: 10.1145/2488608.2488653. Online publication date: 1-Jun-2013
  • (2012) Faster algorithms for privately releasing marginals. Proceedings of the 39th international colloquium conference on Automata, Languages, and Programming - Volume Part I, pages 810-821. DOI: 10.1007/978-3-642-31594-7_68. Online publication date: 9-Jul-2012
