We present Noisy Student Training, a semi-supervised learning approach that
works well even when labeled data is abundant. Noisy Student Training achieves
88.4% top-1 accuracy on ImageNet, which is 2.0% better than the
state-of-the-art model that requires 3.5B weakly labeled Instagram images. On
robustness test sets, it improves ImageNet-A top-1 accuracy from 61.0% to
83.7%, reduces ImageNet-C mean corruption error from 45.7 to 28.3, and reduces
ImageNet-P mean flip rate from 27.8 to 12.2.
Noisy Student Training extends the ideas of self-training and distillation by
using equal-or-larger student models and adding noise to the student during
learning. On ImageNet, we first train an EfficientNet model on labeled
images and use it as a teacher to generate pseudo labels for 300M unlabeled
images. We then train a larger EfficientNet as a student model on the
combination of labeled and pseudo-labeled images. We iterate this process by
putting the student back as the teacher. During the learning of the student, we
inject noise such as dropout, stochastic depth, and data augmentation via
RandAugment into the student so that the student generalizes better than the
teacher. Models are available at
https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet.
Code is available at https://github.com/google-research/noisystudent.
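
For concreteness, below is a minimal sketch of the iterative teacher-student
loop described in the abstract, written in PyTorch. The `make_efficientnet`
factory, the data loaders, and the training hyperparameters are assumptions
for illustration, not the released code linked above; the sketch uses soft
pseudo labels, one of the variants the paper considers.

import torch
import torch.nn.functional as F

def pseudo_label(teacher, unlabeled_loader, device="cpu"):
    # Run the fixed, un-noised teacher (eval mode disables dropout and
    # stochastic depth) over unlabeled images to get soft pseudo labels.
    teacher.eval()
    batches = []
    with torch.no_grad():
        for images in unlabeled_loader:  # assumed to yield image tensors only
            probs = F.softmax(teacher(images.to(device)), dim=-1)
            batches.append((images, probs.cpu()))
    return batches

def train_student(student, labeled_loader, pseudo_batches, epochs=1, device="cpu"):
    # Train the noised student on labeled plus pseudo-labeled data.
    # train() mode enables dropout / stochastic depth inside the model;
    # RandAugment is assumed to be applied in the student's data pipeline.
    opt = torch.optim.SGD(student.parameters(), lr=0.1, momentum=0.9)
    student.train()
    for _ in range(epochs):
        for images, labels in labeled_loader:
            loss = F.cross_entropy(student(images.to(device)), labels.to(device))
            opt.zero_grad(); loss.backward(); opt.step()
        for images, soft_labels in pseudo_batches:
            log_probs = F.log_softmax(student(images.to(device)), dim=-1)
            # Cross-entropy against the teacher's soft distribution.
            loss = -(soft_labels.to(device) * log_probs).sum(dim=-1).mean()
            opt.zero_grad(); loss.backward(); opt.step()
    return student

# Iterating the process: the trained student becomes the next teacher.
# `make_efficientnet` is a hypothetical factory; the initial teacher is
# trained on labeled data only (not shown), then:
#
# teacher = make_efficientnet("b7")
# for _ in range(3):
#     pseudo = pseudo_label(teacher, unlabeled_loader)
#     student = make_efficientnet("l2")   # equal-or-larger than the teacher
#     teacher = train_student(student, labeled_loader, pseudo)

Note the asymmetry: the teacher generates pseudo labels without noise, while
the student is deliberately noised during training, which is what pushes the
student to generalize beyond its teacher.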
@misc{xie2019selftraining,
author = {Xie, Qizhe and Luong, Minh-Thang and Hovy, Eduard and Le, Quoc V.},
note = {arXiv:1911.04252; CVPR 2020},
title = {Self-training with Noisy Student improves ImageNet classification},
url = {http://arxiv.org/abs/1911.04252},
year = 2019
}