New Paper - Joint Selection
I have a new paper out with my colleagues from UMass Amherst: Joint Selection: Adaptively Incorporating Public Information for Private Synthetic Data. Many data sources that researchers and policy makers are interested in are updated through periodic releases, ranging from large-scale surveys such as the Current Population Survey (CPS) to governmental administrative records. Since these datasets often contain sensitive information, it may be the case that only aggregate statistics are released or, alternatively, that a synthetic dataset is constructed and released (in either case, ideally under differential privacy).
We introduce a new method, JAM-PGM, that uses public data to improve the quality of synthetic data generated under differential privacy. In the case of periodically released datasets, public data could include prior releases. In the paper, we look at cases where the public and private data do not follow the same distribution, which is what one would expect when using such techniques in the wild.