New Paper - Efficient and Private Marginal Reconstruction with Local Non-Negativity
I have a new paper out with my colleagues from UMass Amherst and Penn State: Efficient and Private Marginal Reconstruction with Local Non-Negativity. Marginals are statistics that capture low-dimensional structure and correlations among sets of attributes in a dataset and are an important building block for differentially private algorithms. A marginal can be decomposed into a set of queries called residuals. Our paper studies how to decompose noisy answers to marginals into noisy answers to residuals and how to recombine noisy answers to many residuals into noisy answers to marginals.
We introduce a general framework for reconstructing noisy answers to marginals from noisy answers to residuals called ReM
(Residuals-to-Marginals). GReM-MLE
is an instance of ReM
under Gaussian noise that outputs answers that are mutually consistent and that maximize the likelihood of the residuals. GReM-LNN
is an extension of GReM-MLE
that outputs answers that are mutually consistent and non-negative, which often dramatically decreases error in practice since the true marginals are non-negative. Importantly, we can efficiently compute these methods, since marginals and residuals are highly structured workloads of queries.
Abstract:
Differential privacy is the dominant standard for formal and quantifiable privacy and has been used in major deployments that impact millions of people. Many differentially private algorithms for query release and synthetic data contain steps that reconstruct answers to queries from answers to other queries measured by the mechanism. Reconstruction is an important subproblem for such mechanisms to economize the privacy budget, minimize error on reconstructed answers, and allow for scalability to high-dimensional datasets. In this paper, we introduce a principled and efficient postprocessing method ReM (Residuals-to-Marginals) for reconstructing answers to marginal queries. Our method builds on recent work on efficient mechanisms for marginal query release, based on making measurements using a residual query basis that admits efficient pseudoinversion, which is an important primitive used in reconstruction. An extension GReM-LNN (Gaussian Residuals-to-Marginals with Local Non-negativity) reconstructs marginals under Gaussian noise satisfying consistency and non-negativity, which often reduces error on reconstructed answers. We demonstrate the utility of ReM and GReM-LNN by applying them to improve existing private query answering mechanisms: ResidualPlanner and MWEM.
Click here for the preprint and here for the GitHub repo. This paper will appear at NeurIPS 2024.