Report
USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery
| Title | USat: A Unified Self-Supervised Encoder for Multi-Sensor Satellite Imagery |
|---|---|
| Authors | Irvin, Jeremy; Tao, Lucas; Zhou, Joanne; Ma, Yuntao; Nashold, Langston; Liu, Benjamin; Ng, Andrew Y. |
| Publication Year | 2023 |
| Collection | Computer Science; Statistics |
| Subject Terms | Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence; Computer Science - Machine Learning; Electrical Engineering and Systems Science - Image and Video Processing; Statistics - Applications |
| Description | Large, self-supervised vision models have led to substantial advancements in automatically interpreting natural images. Recent works have begun tailoring these methods to remote sensing data, which has rich structure with multi-sensor, multi-spectral, and temporal information, providing massive amounts of self-labeled data that can be used for self-supervised pre-training. In this work, we develop a new encoder architecture called USat that can input multi-spectral data from multiple sensors for self-supervised pre-training. USat is a vision transformer with modified patch projection layers and positional encodings to model spectral bands with varying spatial scales from multiple sensors. We integrate USat into a Masked Autoencoder (MAE) self-supervised pre-training procedure and find that a pre-trained USat outperforms state-of-the-art self-supervised MAE models trained on remote sensing data on multiple remote sensing benchmark datasets (by up to 8%) and leads to improvements in low-data regimes (by up to 7%). Code and pre-trained weights are available at https://github.com/stanfordmlgroup/USat. |
| Document Type | Working Paper |
| Access URL | http://arxiv.org/abs/2312.02199 |
| Accession Number | edsarx.2312.02199 |
| Database | arXiv |
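The description above mentions modified patch projection layers that handle spectral bands at varying spatial scales. The following is a minimal sketch, not the authors' implementation, of the general idea: give each sensor band group a patch size (in pixels) proportional to its resolution so that every patch covers the same ground extent, then project each group into a shared embedding space. All names, sizes, and the `GROUND_PATCH_M` constant are illustrative assumptions.

```python
import numpy as np

# Hedged sketch (NOT the USat code): per-band-group patch embedding in
# which patch size scales with ground sample distance (GSD), so patches
# from different sensors cover the same ground area.

rng = np.random.default_rng(0)
EMBED_DIM = 64        # shared token dimension (assumed)
GROUND_PATCH_M = 80   # ground extent in meters per patch (assumed)

def patch_embed(image, gsd_m, weight):
    """Split an (H, W, C) image into flat patches and linearly project them.

    Patch side in pixels = GROUND_PATCH_M / gsd_m, so a coarser band
    uses fewer pixels per patch but covers the same ground extent.
    """
    p = GROUND_PATCH_M // gsd_m           # patch side length in pixels
    h, w, c = image.shape
    gh, gw = h // p, w // p               # patch grid dimensions
    patches = (image[: gh * p, : gw * p]
               .reshape(gh, p, gw, p, c)
               .transpose(0, 2, 1, 3, 4)  # group pixels by patch
               .reshape(gh * gw, p * p * c))
    return patches @ weight               # (num_patches, EMBED_DIM)

# Two hypothetical band groups observing the same scene:
# 4 bands at 10 m/pixel and 2 bands at 20 m/pixel.
img_10m = rng.standard_normal((64, 64, 4))
img_20m = rng.standard_normal((32, 32, 2))

# One projection matrix per band group (input dims differ, output shared).
w_10m = rng.standard_normal((8 * 8 * 4, EMBED_DIM)) * 0.02
w_20m = rng.standard_normal((4 * 4 * 2, EMBED_DIM)) * 0.02

tokens_10m = patch_embed(img_10m, 10, w_10m)
tokens_20m = patch_embed(img_20m, 20, w_20m)

# Both groups tile the same 8x8 grid of ground patches, so their tokens
# align spatially and could share ground-position-based positional
# encodings inside a transformer encoder.
tokens = np.concatenate([tokens_10m, tokens_20m], axis=0)
print(tokens_10m.shape, tokens_20m.shape, tokens.shape)
```

Because both groups end up on the same ground-aligned patch grid, their tokens can be concatenated and fed to a single transformer, which is the kind of unification the abstract attributes to USat's modified projections and positional encodings.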