The effectiveness of MAE pre-pretraining for billion-scale pretraining
Self-supervised MAE pre-pretraining, followed by weakly supervised pretraining on billions of images, yields stronger visual representations than weakly supervised pretraining alone.
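
The summary describes a two-stage recipe, so here is a minimal conceptual sketch of it in PyTorch, not the paper's code: stage 1 masks most patch tokens and trains the encoder to reconstruct what was hidden (MAE), and stage 2 reuses that same encoder for classification on weak labels such as hashtags. All sizes, module names, and the toy data below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

PATCHES, DIM, MASK_RATIO, NUM_WEAK_LABELS = 196, 256, 0.75, 1000

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=DIM, nhead=8, batch_first=True),
    num_layers=2,
)
decoder = nn.Linear(DIM, DIM)          # stand-in for MAE's lightweight decoder
classifier = nn.Linear(DIM, NUM_WEAK_LABELS)

def mae_step(patch_tokens):
    """Stage 1 (pre-pretraining): reconstruct masked patch embeddings.

    The paper reconstructs pixels; this toy version regresses the mean
    masked embedding from the encoded visible tokens, just to show the flow.
    """
    b, n, d = patch_tokens.shape
    keep = int(n * (1 - MASK_RATIO))
    idx = torch.rand(b, n).argsort(dim=1)        # random mask per sample
    visible = torch.gather(patch_tokens, 1, idx[:, :keep, None].expand(-1, -1, d))
    masked = torch.gather(patch_tokens, 1, idx[:, keep:, None].expand(-1, -1, d))
    encoded = encoder(visible)                   # encoder sees visible tokens only
    pred = decoder(encoded.mean(dim=1))
    return F.mse_loss(pred, masked.mean(dim=1))

def weak_supervision_step(patch_tokens, weak_labels):
    """Stage 2: initialize from the MAE encoder, then train on weak labels."""
    logits = classifier(encoder(patch_tokens).mean(dim=1))
    return F.cross_entropy(logits, weak_labels)

tokens = torch.randn(4, PATCHES, DIM)            # toy patch embeddings
labels = torch.randint(0, NUM_WEAK_LABELS, (4,)) # toy hashtag-style labels
print(mae_step(tokens).item(), weak_supervision_step(tokens, labels).item())
```

The key design point the sketch tries to make visible is that the encoder object is shared: stage 2 does not start from random weights but from the MAE-initialized encoder, which is where the reported gains come from.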