Applied & Computational Mathematics Seminar

The state of the art in differentially private synthetic data

Speaker: Daniel Sheldon (UMass Amherst Computer Science)

Date: 3/3/26

Abstract: Differential privacy has emerged as a leading standard for privacy protection, with significant adoption by both commercial and governmental enterprises. Many common computations on data can be made differentially private, but one of the most appealing uses of differential privacy is the generation of synthetic data. Synthetic data is intended to be broadly representative of the source data, with the goal of allowing downstream users to accurately perform a wide range of computations without further restrictions on data access. In this talk I will review both the promise and the inherent limitations of private synthetic data. I will introduce the select–measure–reconstruct paradigm, which recent benchmarks show to outperform other approaches for generating differentially private synthetic data, and explain the core ideas that make it effective. A central challenge is how to combine noisy, differentially private measurements into a coherent global model of the data distribution. I will describe Private-PGM, a key technical approach for the reconstruction phase, in which noisy statistics are reconciled into a consistent representation from which synthetic records can be sampled. Finally, I will describe AIM, a state-of-the-art generator, and conclude with recent advances and open questions.
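For readers new to the select–measure–reconstruct paradigm named in the abstract, the following is a minimal toy sketch of its three steps on a single categorical attribute. It is not Private-PGM or AIM: the selection step is fixed by hand, the measurement step uses the standard Laplace mechanism (assuming add/remove-one-record neighbors, so a histogram has L1 sensitivity 1), and the reconstruction step is a simple clip-and-normalize stand-in for a real graphical-model fit. All function names, the toy data, and the choice of epsilon are hypothetical and chosen only for illustration.

```python
import numpy as np

def measure_marginal(values, domain_size, epsilon, rng):
    """Measure step: noisy histogram of one attribute via the Laplace mechanism.

    Assumes add/remove-one-record neighbors, so the histogram has
    L1 sensitivity 1 and the noise scale is 1 / epsilon.
    """
    counts = np.bincount(values, minlength=domain_size).astype(float)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon, size=domain_size)
    return counts + noise

def reconstruct_distribution(noisy_counts):
    """Reconstruct step (toy stand-in): clip negatives, then normalize.

    This is post-processing of the noisy measurement, so it does not
    consume additional privacy budget.
    """
    clipped = np.clip(noisy_counts, 0.0, None)
    total = clipped.sum()
    if total <= 0:
        # Fall back to the uniform distribution if all mass was clipped.
        return np.full(len(noisy_counts), 1.0 / len(noisy_counts))
    return clipped / total

def sample_synthetic(probs, n, rng):
    """Draw n synthetic records from the reconstructed distribution."""
    return rng.choice(len(probs), size=n, p=probs)

rng = np.random.default_rng(0)

# Hypothetical source data: 1000 records of one attribute with 4 categories.
data = rng.integers(0, 4, size=1000)

# Select step: here the single one-way marginal is chosen by hand.
noisy = measure_marginal(data, domain_size=4, epsilon=1.0, rng=rng)
probs = reconstruct_distribution(noisy)
synthetic = sample_synthetic(probs, n=1000, rng=rng)
```

With several overlapping marginals, the noisy measurements generally disagree with one another, which is the reconciliation problem the abstract attributes to the reconstruction phase; this single-marginal sketch sidesteps it entirely.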