Go to main content

In this dissertation, the problem of solving an empirical risk minimization (ERM) task is tackled in a differentially private (DP) way. The ultimate goal is to propose DP algorithms with high utility and efficiency under high privacy range. Several algorithms are proposed, each solving the problem through different methods under different assumptions. The first algorithm applies on strongly convex and smooth objectives. It utilizes a sensitivity calculation strategy to scale the sensitivity of the model after multiple iterations of randomly shuffled mini-batch permutation stochastic gradient descent through contraction mapping. Periodic averaging technique is applied to fasten convergence and further reduce the sensitivity to decrease the scale of noise perturbation. The second algorithm applies on convex objectives with non-smooth regularization. It utilizes a stochastic version of ADMM algorithm to split the smooth and non-smooth portions of the problem, and both sub-problems have closed-form solutions. Two versions are proposed based on this technique: One version is based on gradient perturbation, and applies the privacy amplification result to reduce the noise. The other version is based on output perturbation and sensitivity calculation. The third algorithm applies on both convex and non-convex problems. It spends a portion of the privacy budget on objective evaluations to perform adaptive step size selection through Armijo line search, and the privacy is bounded by a non-trivial utilization of the sparse vector technique. It also adjusts the per-iteration privacy budget during runtime, according to the reliability of the noisy gradient. The last algorithm is an application of differential privacy on the information retrieval task of a ranking model. By observing that only the preference labels come from the user, document vectors are considered as public information, which can be used to scale the sensitivity of the accumulated gradients. Extensive experimental results show the effectiveness of all the proposed algorithms, comparing to private baselines. This dissertation advances the state-of-the-art in the area of DP-ERM, and provide new insights of combining ERM technique and privacy protection techniques to train a machine learning model privately, which greatly benefits the machine learning practitioners to release their model with strong privacy guarantee.

Metric
From
To
Interval
Export
Download Full History