Instructor(s): Nianqiao Phyllis Ju

Overview

As data continues to be collected at unprecedented scale and in sensitive domains, questions of privacy, confidentiality, and responsible data use have become central to modern machine learning. Privacy-preserving machine learning seeks to enable useful statistical learning while rigorously limiting what can be inferred about any individual in the dataset. In this course, we will study differential privacy and explore how it interacts with learning theory, optimization, Bayesian inference, and sampling. We will cover classical and cutting-edge methods for training models under privacy constraints, as well as methods for making inferences from privatized data. Throughout the course, we will focus on the fundamental trade-offs between privacy, utility, and computation.

Textbooks

(a) Dwork and Roth. "The Algorithmic Foundations of Differential Privacy."

(b) Murphy. "Probabilistic Machine Learning: An Introduction."


Course site on canvas.dartmouth.edu.

Schedule

The entries in the first column link to weekly recaps on the instructor's blog. Lecture notes such as 'chapter1.pdf' are available only to course participants on Canvas.

Week  Date Label Contents
W1 1/6 W1-1

Course overview, randomized response model, definition of ε-DP.

Suggested reading: chapter1.pdf
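A minimal sketch of the randomized response model, to make the definition concrete. The function names and the choice of truth probability p = 0.75 are illustrative, not part of the course materials; with truth probability p, the mechanism satisfies ε-DP for ε = ln(p / (1 − p)).

```python
import math
import random

def randomized_response(truth, p):
    """Answer truthfully with probability p, otherwise report the opposite."""
    return truth if random.random() < p else not truth

def rr_epsilon(p):
    """Privacy level of randomized response with truth probability p > 1/2:
    eps = ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# With p = 0.75, randomized response satisfies ln(3)-DP.
eps = rr_epsilon(0.75)
```

Because each respondent answers through their own coin flip, plausible deniability holds no matter what the analyst later does with the answers.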

1/8 W1-2

Laplace mechanism.

Suggested reading: Dwork, Cynthia, et al. "Calibrating noise to sensitivity in private data analysis."
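A minimal sketch of the Laplace mechanism from the Dwork et al. reading: add Laplace noise with scale sensitivity / ε to the true answer. The function name and the example query are illustrative.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, eps, rng=None):
    """Release true_value plus Laplace noise with scale sensitivity / eps."""
    rng = rng or np.random.default_rng()
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / eps)

# A counting query has sensitivity 1: adding or removing one person
# changes the count by at most 1.
noisy_count = laplace_mechanism(true_value=1234, sensitivity=1.0, eps=0.5)
```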

W2 1/13 W2-1

Exponential mechanism.

Suggested reading:

(a) McSherry, Frank, and Kunal Talwar. "Mechanism design via differential privacy."

(b) Dwork and Roth. Section 2.4.
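A minimal sketch of the exponential mechanism: sample a candidate with probability proportional to exp(ε · score / (2Δ)), where Δ is the sensitivity of the score function. The function name is illustrative; the log-sum-exp shift is just numerical hygiene, not part of the mechanism.

```python
import numpy as np

def exponential_mechanism(scores, eps, sensitivity=1.0, rng=None):
    """Sample index i with probability proportional to
    exp(eps * scores[i] / (2 * sensitivity))."""
    rng = rng or np.random.default_rng()
    scores = np.asarray(scores, dtype=float)
    logits = eps * scores / (2.0 * sensitivity)
    logits -= logits.max()              # numerical stabilization only
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(scores), p=probs))
```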

1/14 W2-2a

Exponential mechanism continued.

1/15 W2-2b

Composition and post-processing.

Suggested reading: Dwork and Roth. Chapter 3.5.
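Basic composition can be summarized in one line of accounting: running k mechanisms with parameters (ε_i, δ_i) on the same data satisfies (Σ ε_i, Σ δ_i)-DP, and post-processing costs nothing. A sketch of that bookkeeping (function name illustrative):

```python
def basic_composition(epsilons, deltas=None):
    """Basic composition: parameters add up across mechanisms run on the
    same data; post-processing does not change them."""
    eps_total = sum(epsilons)
    delta_total = sum(deltas) if deltas is not None else 0.0
    return eps_total, delta_total
```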

W3 1/20 W3-1

Basic composition review, additive secrecy.

Suggested reading: Dwork and Roth. Chapter 2.3.

1/21 W3-2a

Approximate DP & Gaussian noise.

Suggested reading: chapter2.pdf
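A minimal sketch of the Gaussian mechanism for (ε, δ)-DP, using the classic calibration σ = Δ₂ · sqrt(2 ln(1.25/δ)) / ε, which is valid for ε < 1. The function names are illustrative.

```python
import math
import numpy as np

def gaussian_sigma(l2_sensitivity, eps, delta):
    """Classic calibration (valid for eps < 1):
    sigma = Delta_2 * sqrt(2 * ln(1.25 / delta)) / eps."""
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / eps

def gaussian_mechanism(true_value, l2_sensitivity, eps, delta, rng=None):
    """Release true_value plus Gaussian noise calibrated for (eps, delta)-DP."""
    rng = rng or np.random.default_rng()
    return true_value + rng.normal(0.0, gaussian_sigma(l2_sensitivity, eps, delta))
```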

1/22 W3-2b

Privacy amplification by subsampling.

Suggested reading: Section 6 of https://arxiv.org/pdf/2210.00597.
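The headline amplification bound for pure DP can be stated in one formula: running an ε-DP mechanism on a random q-fraction of the data satisfies ln(1 + q(e^ε − 1))-DP, which is roughly qε for small ε. A sketch (function name illustrative):

```python
import math

def amplified_epsilon(eps, q):
    """Privacy amplification by subsampling (pure DP): an eps-DP mechanism
    run on a random q-fraction of the data satisfies
    ln(1 + q * (exp(eps) - 1))-DP, roughly q * eps for small eps."""
    return math.log(1.0 + q * (math.exp(eps) - 1.0))
```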

W4 1/27 W4-1

Statistical decision theory recap.

Suggested reading: Murphy. Sections 4.3 and 5.4.

1/29 W4-2

Differentially private empirical risk minimization.

Suggested reading: Chaudhuri, Kamalika, Claire Monteleoni, and Anand D. Sarwate. "Differentially private empirical risk minimization."

W5 2/3 W5-1

Differentially private gradient descent.

Suggested reading: Bassily, Raef, Adam Smith, and Abhradeep Thakurta. "Private empirical risk minimization: Efficient algorithms and tight error bounds."
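A minimal sketch of one noisy gradient step in the style of DP-SGD: clip each per-example gradient to a fixed norm, average, add Gaussian noise scaled to the clipping norm, then descend. This is a NumPy toy, not the algorithm from the reading verbatim; all names and the 1e-12 guard against zero norms are illustrative.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier,
                rng=None):
    """One noisy gradient step: clip each per-example gradient to clip_norm,
    average, add Gaussian noise, then take a descent step."""
    rng = rng or np.random.default_rng()
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_example_grads),
                       size=mean_grad.shape)
    return params - lr * (mean_grad + noise)
```

Clipping bounds each example's influence (the sensitivity), which is what lets the added Gaussian noise be calibrated at all.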

2/5 W5-2

Bayesian statistics and computing review.

Suggested reading: Murphy. Sections 4.6.1 and 4.6.2.

2/6

Project proposal due

W6 2/10 W6-1

Bayesian inference given privatized data

Suggested reading: Bernstein, Garrett, and Daniel R. Sheldon. "Differentially private Bayesian inference for exponential families."

2/11 W6-2a

Report-noisy-max

Suggested reading: Balog, Matej, et al. "Lost relatives of the Gumbel trick."
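A minimal sketch of report-noisy-max in the Gumbel-noise variant, which connects directly to the "Gumbel trick" reading: adding i.i.d. Gumbel noise with scale 2Δ/ε to the scores and reporting the argmax samples from the same distribution as the exponential mechanism. The function name is illustrative.

```python
import numpy as np

def report_noisy_max(scores, eps, sensitivity=1.0, rng=None):
    """Add i.i.d. Gumbel noise with scale 2 * sensitivity / eps to the scores
    and report the argmax; by the Gumbel-max trick this matches the
    exponential mechanism's output distribution."""
    rng = rng or np.random.default_rng()
    scores = np.asarray(scores, dtype=float)
    noise = rng.gumbel(loc=0.0, scale=2.0 * sensitivity / eps, size=scores.shape)
    return int(np.argmax(scores + noise))
```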

2/12 W6-2b

Permute-and-flip

Suggested reading: McKenna, Ryan, and Daniel R. Sheldon. "Permute-and-flip: A new mechanism for differentially private selection."
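A minimal sketch of permute-and-flip, following the McKenna-Sheldon reading: visit candidates in uniformly random order and accept candidate i with probability exp(ε(scoreᵢ − score*) / (2Δ)), where score* is the best score. The best candidate accepts with probability 1, so a single pass always returns. Names are illustrative.

```python
import math
import numpy as np

def permute_and_flip(scores, eps, sensitivity=1.0, rng=None):
    """Visit candidates in uniformly random order; accept candidate i with
    probability exp(eps * (scores[i] - max(scores)) / (2 * sensitivity)).
    The argmax always accepts, so one pass suffices."""
    rng = rng or np.random.default_rng()
    scores = np.asarray(scores, dtype=float)
    best = scores.max()
    for i in rng.permutation(len(scores)):
        if rng.random() < math.exp(eps * (scores[i] - best) / (2.0 * sensitivity)):
            return int(i)
```

Permute-and-flip never does worse than report-noisy-max in expected score, which is what motivates it as a drop-in replacement for private selection.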

W7 2/17 W7-1a

Open problems

Suggested reading: Ju, Nianqiao, et al. "Data Augmentation MCMC for Bayesian Inference from Privatized Data."

2/17 W7-1b

Renyi differential privacy

Suggested reading: Mironov, Ilya. "Rényi differential privacy."

2/18 W7-2a

Privacy properties of MCMC algorithms

Suggested reading: Bertazzi, Andrea, et al. "Differential privacy guarantees of Markov chain Monte Carlo algorithms."

2/19 W7-2b

Make MCMC private

Suggested reading: Heikkilä, Mikko, et al. "Differentially private Markov chain Monte Carlo."

W8 2/24 W8-1a

Differentially private PCA
Suggested reading: Chaudhuri, Kamalika, Anand Sarwate, and Kaushik Sinha. "Near-optimal differentially private principal components."

2/24 W8-1b

Gaussian Differential Privacy

Suggested reading: Dong, Jinshuo, Aaron Roth, and Weijie J. Su. "Gaussian differential privacy."

2/25 W8-2a

Robust statistics and privacy

Suggested reading: Avella-Medina, Marco. "Privacy-preserving parametric inference: A case for robust statistics."

2/26 W8-2b

Evaluating privacy leakage in LLMs

Suggested reading: Kim, Siwon, et al. "ProPILE: Probing privacy leakage in large language models."

W9 3/3 W9-1a

Dan Sheldon '99 (UMass Amherst): The state of the art in differentially private synthetic data

3/4 W9-1b

Measuring the utility of synthetic data

Suggested reading: Guo, Shijie, and Jingchen Hu. "Data privacy protection and utility preservation through Bayesian data synthesis: a case study on Airbnb listings."

3/5 W9-2a

Robin Gong (Rutgers): Transparency in Data Privacy: How to Utilize it for Principled Statistical Inference

3/5 W9-2b

Jingchen Monika Hu (Bingham): Privacy amplification for synthetic data using range restriction

W10 3/10 W10

Short project presentations