Attack well-trained classifiers with unknown weights within 1 gradient step



CIFAR10: plane ⟶ car



100 images train classifiers with 78.2% ± 1.1% test accuracy

to predict 45.9% ± 18.1% of test images labeled plane as car
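
The plane-to-car figure above is a fraction: of the test images whose true label is plane, how many the classifier predicts as car. A minimal sketch of that computation follows. It assumes the standard CIFAR10 class ordering (airplane = 0, automobile = 1); the function name `misclassification_rate` and the toy label arrays are hypothetical, made up for illustration, and are not taken from the original experiments.

```python
import numpy as np

# Assumed standard CIFAR10 class indices: airplane = 0, automobile = 1.
PLANE, CAR = 0, 1


def misclassification_rate(true_labels, predicted_labels, source=PLANE, target=CAR):
    """Fraction of test images labeled `source` that are predicted as `target`."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    # Keep only the test images whose true label is the source class.
    source_mask = true_labels == source
    if not source_mask.any():
        raise ValueError("no test images with the source label")
    # Among those, measure how often the prediction is the target class.
    return float(np.mean(predicted_labels[source_mask] == target))


# Toy labels, invented for illustration only (not CIFAR10 results).
true_labels = [0, 0, 0, 0, 1, 2]
predicted_labels = [1, 1, 0, 3, 1, 2]
rate = misclassification_rate(true_labels, predicted_labels)
print(f"plane -> car rate: {rate:.1%}")
```

Only images whose true label is plane enter the denominator, so the rate is separate from overall test accuracy, which is why the two numbers are reported side by side.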