| Abstract: |
In this evaluation study, we conducted an extensive experimental analysis of the training data reconstruction attack proposed by Haim et al. in 2022 [1], and evaluated the impact of different countermeasures against it. We focused on three questions. First, does the distribution of training data labels (e.g., iid vs. non-iid) significantly affect the quality of the data reconstructed by this attack? In other words, can the privacy damage be reduced by manipulating the label distribution of the training data? Second, is this model inversion approach effective only for MLP-based models, or does it work equally well on other architectures such as CNNs? Third, is differential privacy an effective countermeasure against this model inversion attack, and how much noise must be added to the trained model parameters to defend against it without losing too much model accuracy? We evaluated these settings on the same datasets (MNIST and CIFAR10) used in the original paper. Our preliminary experimental results indicate that: (1) the distribution of training data labels has little effect on the quality of the reconstructed images; (2) as the trained model becomes more complex, the quality of the training images reconstructed by this attack degrades noticeably, especially on the more complex image dataset (CIFAR10); and (3) differential privacy performs well against this attack on both datasets (MNIST and CIFAR10), effectively reducing the quality of the reconstructed images at the cost of a small loss in model accuracy.
We believe these results offer technical insights for researchers working on deep learning model training, particularly on privacy protection. |
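The third question, how much noise to add to the trained parameters, can be illustrated with a minimal sketch. It assumes the model parameters are available as a flat list of floats and that the perturbation is i.i.d. Gaussian noise with standard deviation `sigma`. The names `perturb_parameters`, `mean_abs_shift`, and the example `weights` are hypothetical and are not taken from the evaluated implementation. Adding noise this way does not by itself establish a formal differential privacy guarantee; it only shows the kind of noise-scale sweep the question implies.

```python
import random


def perturb_parameters(params, sigma, seed=None):
    """Return a copy of `params` with i.i.d. Gaussian noise N(0, sigma^2) added.

    Assumption: `params` is a flat list of floats standing in for trained weights.
    """
    rng = random.Random(seed)
    return [w + rng.gauss(0.0, sigma) for w in params]


def mean_abs_shift(original, noisy):
    """Average absolute change per parameter, a rough proxy for distortion."""
    return sum(abs(a - b) for a, b in zip(original, noisy)) / len(original)


# Hypothetical flattened parameters of a trained model.
weights = [0.5, -1.2, 0.03, 2.0]

# Sweep the noise scale; in an actual study each perturbed model would be
# scored on both test accuracy and reconstruction quality.
for sigma in (0.0, 0.01, 0.1):
    noisy = perturb_parameters(weights, sigma, seed=0)
    print(sigma, mean_abs_shift(weights, noisy))
```

In such a sweep, the noise scale of interest is the smallest `sigma` at which reconstruction quality drops substantially while test accuracy stays close to that of the unperturbed model.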